Turning the generics problem on its head

1,175 views
Skip to first unread message

Evan Huus

unread,
Mar 18, 2014, 6:34:44 PM3/18/14
to golan...@googlegroups.com
I realize generics have been *extensively* discussed before, and I don't intend to reopen any old wounds. I've waded through a lot of previous threads, and read Russ's canonical blog post [1] a few times, so I believe I have a grasp on the current state of the discussion. I hope this post results in some constructive, useful, and new ideas.

In general, I agree with Russ's assertion that there are three basic approaches to generics: compile-time duplication, runtime boxing, and no generics at all. Go currently does not support generics per say, but of course the user is free to manually do either of the other two:
  • compile-time duplication can be done by literal copy-paste of the code and adjusting the types
  • runtime boxing can be done by use of interface{}
Since both of these approaches involve trade-offs (either in compiling-time, running-time, or running-space), the fact that Go lets the use choose which trade-off to make is perhaps a *good* thing; the user can choose the trade-off that makes sense for their scenario. The implicit premise here is that any generics implementation must trade-off on one of those factors; despite extensive search I have not found a generics implementation that was "free" in the relevant sense.

So, if we accept that the choice of which trade-off to make is good, and we know that Go already supports both options, then in some sense we're done. Go should never implement one or the other as *the* way of doing generics in Go. However, what we *can* do is make it easier and safer to manually do compile-time duplication and runtime boxing. Since these two techniques are both much smaller than the rather scary concept of "generics", approaching the problem this way turns one large problem into two smaller ones, which can be tackled separately, or not at all.

[1] http://research.swtch.com/generic

Compile-Time Code Duplication

The most annoying thing about manual compile-time duplication is that it currently requires manual copy-paste of the code and adjustment of the relevant types (with a concomitant lost of maintainability). But this is easy to automate, if we're not focused on implementing it as a method for generics and instead implement it as a feature in its own right. The actual syntax or name isn't worth bikeshedding over at this point, but for the sake of an example we could have a keyword "duplicate" such that one could write:
type T interface{}
type foo struct { a T; }
type bar foo.duplicate(T=int)
Again, this isn't a serious proposal; just an example to illustrate the point that the problem becomes much simpler when viewed outside the shadow of generics. The use of a name like duplicate or copy or ... makes the cost in space explicit to the programmer while still making it easy for them to accomplish what they want.

Runtime Boxing

The most annoying thing about manual runtime boxing with interface{} is the loss of type-safety; for a simple collection class, the user must manually ensure that nothing of the wrong type accidentally gets added or deal with the resulting mess of type switches and assertions. But again, this is easy for the compiler to check if we don't approach it as generics per-say and just as a new language convenience. Again, the actual syntax and naming isn't worth bikeshedding, but for the sake of an example we could have a keyword "assert" such that one could write:
type T interface{}
type foo struct { a T; }
var bar foo.assert(T=int)
The extra ".assert(T=int)" would not change the compiled code at all, it would simply hint to the compiler that anywhere type T is used with respect to bar, it must be an int and not just a generic interface.

---

I'm sure that this approach has some flaws; constructive criticism is very welcome. Please don't discuss specific details/problems of the examples, they really are just examples to illustrate the broader point. Thoughts? Comments? Use cases this doesn't address?

Rémy Oudompheng

unread,
Mar 18, 2014, 7:00:23 PM3/18/14
to Evan Huus, golang-nuts
2014-03-18 23:34 GMT+01:00 Evan Huus <eap...@gmail.com>:
> So, if we accept that the choice of which trade-off to make is good, and we
> know that Go already supports both options, then in some sense we're done.
> Go should never implement one or the other as *the* way of doing generics in
> Go. However, what we *can* do is make it easier and safer to manually do
> compile-time duplication and runtime boxing. Since these two techniques are
> both much smaller than the rather scary concept of "generics", approaching
> the problem this way turns one large problem into two smaller ones, which
> can be tackled separately, or not at all.
>
> [1] http://research.swtch.com/generic

Why do you think your suggestions are "smaller than the concept of
generics" ? They do not look less complicated than a regular concept
of generic programming.

>
> Compile-Time Code Duplication
>
> The most annoying thing about manual compile-time duplication is that it
> currently requires manual copy-paste of the code and adjustment of the
> relevant types (with a concomitant lost of maintainability). But this is
> easy to automate, if we're not focused on implementing it as a method for
> generics and instead implement it as a feature in its own right. The actual
> syntax or name isn't worth bikeshedding over at this point, but for the sake
> of an example we could have a keyword "duplicate" such that one could write:
>>
>> type T interface{}
>> type foo struct { a T; }
>> type bar foo.duplicate(T=int)
>
> Again, this isn't a serious proposal; just an example to illustrate the
> point that the problem becomes much simpler when viewed outside the shadow
> of generics. The use of a name like duplicate or copy or ... makes the cost
> in space explicit to the programmer while still making it easy for them to
> accomplish what they want.

This looks like a form of templating. But I don't even understand how
to make a generic tree data structure in your example. For each
proposal that implies a change in the language, you have to try to
imagine what the rules would be ? When are types
identical/assignable/convertible ?

> Runtime Boxing
>
> The most annoying thing about manual runtime boxing with interface{} is the
> loss of type-safety; for a simple collection class, the user must manually
> ensure that nothing of the wrong type accidentally gets added or deal with
> the resulting mess of type switches and assertions. But again, this is easy
> for the compiler to check if we don't approach it as generics per-say and
> just as a new language convenience. Again, the actual syntax and naming
> isn't worth bikeshedding, but for the sake of an example we could have a
> keyword "assert" such that one could write:
>>
>> type T interface{}
>> type foo struct { a T; }
>> var bar foo.assert(T=int)
>
> The extra ".assert(T=int)" would not change the compiled code at all, it
> would simply hint to the compiler that anywhere type T is used with respect
> to bar, it must be an int and not just a generic interface.

This is similar to the type erasure process that happens for example
in Java generics. Again, you have to think, in such a context, about
what the rules should be ? Can you assign a foo<T=int> to a
foo<T=interface{}> ? How do you define a tree data structure in this
context ? Can you write a generic sort algorithm ?

Can you write a generic map function ? (something that takes a []T and
a "func(T) U" and returns a []U).

If you can't do these, what can you do with this proposal ?

> I'm sure that this approach has some flaws; constructive criticism is very
> welcome. Please don't discuss specific details/problems of the examples,
> they really are just examples to illustrate the broader point. Thoughts?
> Comments? Use cases this doesn't address?

I'm pretty sure you can find earlier discussions about these concepts.
I find it very difficult to imagine how the concepts above can be
turned into something usable (e.g. I can't imagine using them to solve
the questions I ask). For the first example, mixing the concept of
interface with a variable type is very confusing.

Last year I wanted to gather ideas and remarks at
https://github.com/remyoudompheng/go-misc/blob/master/generics/generics.markdown
but I seem to have stopped.

Rémy.

Evan Huus

unread,
Mar 18, 2014, 7:12:31 PM3/18/14
to golan...@googlegroups.com, Evan Huus
 
The point is that the rules don't change from current go; the proposed syntax is literally just short-hand telling the compiler to do a copy-paste of the code with search-replace of the relevant types. This makes it massively simpler to implement.

> Runtime Boxing
>
> The most annoying thing about manual runtime boxing with interface{} is the
> loss of type-safety; for a simple collection class, the user must manually
> ensure that nothing of the wrong type accidentally gets added or deal with
> the resulting mess of type switches and assertions. But again, this is easy
> for the compiler to check if we don't approach it as generics per-say and
> just as a new language convenience. Again, the actual syntax and naming
> isn't worth bikeshedding, but for the sake of an example we could have a
> keyword "assert" such that one could write:
>>
>> type T interface{}
>> type foo struct { a T; }
>> var bar foo.assert(T=int)
>
> The extra ".assert(T=int)" would not change the compiled code at all, it
> would simply hint to the compiler that anywhere type T is used with respect
> to bar, it must be an int and not just a generic interface.

This is similar to the type erasure process that happens for example
in Java generics. Again, you have to think, in such a context, about
what the rules should be ?

Again, the rules don't change. This is just short-hand telling the compiler to add a few extra compile-time type assertions. Remove the extra assert() bit and it's just a typical data structure built on an interface{} - the rules for those are well-defined already.
 
Can you assign a foo<T=int> to a
foo<T=interface{}> ? How do you define a tree data structure in this
context ? Can you write a generic sort algorithm ?

Can you write a generic map function ? (something that takes a []T and
a "func(T) U" and returns a []U).

If you can't do these, what can you do with this proposal ?

> I'm sure that this approach has some flaws; constructive criticism is very
> welcome. Please don't discuss specific details/problems of the examples,
> they really are just examples to illustrate the broader point. Thoughts?
> Comments? Use cases this doesn't address?

I'm pretty sure you can find earlier discussions about these concepts.
I find it very difficult to imagine how the concepts above can be
turned into something usable (e.g. I can't imagine using them to solve
the questions I ask).

The questions you asked are ones that can already be solved in Go in one of two ways: copy-pasting the code for each type, or using interface{} and runtime assertions. The above ideas are about making those ways shorter to type and safer to compile; they really do not change the logical solutions or the rules of the language at all.
 
For the first example, mixing the concept of
interface with a variable type is very confusing.

The examples are just examples, however I probably should have explained them a bit more thoroughly. In the code-duplication case, the idea is simply that the compiler duplicates the code, and replaces any occurrences of type "T" with type "int" in the copy.

simon place

unread,
Mar 18, 2014, 7:22:17 PM3/18/14
to golan...@googlegroups.com
i was wondering if Go could use a pre-compiler, written in go, to amongst other things, automatically do generic "compile-time duplication".

it'd be just another tool really, so not really part of the language.

and once you had the duplicates it would not add to compile time much.

with "runtime boxing" isn't it possible for the compiler to notice you boxing/unboxing unnecessary and miss it out?

Rémy Oudompheng

unread,
Mar 18, 2014, 7:22:36 PM3/18/14
to Evan Huus, golang-nuts
To make it simpler, you first need to explain what the rules are.
Saying that "Go rules don't change" doesn't. You seem to say it's very
simple because it's just "copy-paste and search-replace". I disagree
for a number of reasons: the first one is that you talk about
"copy-pasting the code", without explaining what "the code" is.

If I say (syntax doesn't matter here)

type bar foo.duplicate(T=int)

What is copy-pasted?
What is the type of bar?
Is it "struct { a int }" (an unnamed type)?
Is it a named type named "foo" (impossible because there is already a
type named foo)?
Or something else (a new Go type species, but impossible because you
said Go rules don't change)?
Does it have methods (impossible because being a named type is not
possible as said in the latter sentence)?
Supposing having methods is possible, does it replace the type name in methods?
Does it also replace type name in method bodies? Where does it stop?
What if the "T" is used in code not intended to be specialized at all?

So there are hundreds of questions, each of them requires definition
of a precise rule for the newly introduced concepts, even if you feel
like you don't need rules.
Here again, you need rules. The devil is in the detail you have
carefully hidden by saying "just add a few extra assertions". Which
assertions ?

Rémy.

Evan Huus

unread,
Mar 18, 2014, 7:29:59 PM3/18/14
to golan...@googlegroups.com, Evan Huus
Go already has a very nice formal specification [1]. The *examples* I proposed would not require changing any existing component of that specification, they would simply add to it.

In the code-duplication case, the code:

type T interface{}
type foo struct { a T; }
type bar foo.duplicate(T=int)
would be *exactly equivalent* to
type T interface{}
type foo struct { a T; }
type bar struct { a int; }
The hint (whatever the final syntax) would specify to the compiler a base type (in this example foo) and a list of type substitutions (in this example T->int). The compiler would take a copy of the code from the base type, perform the appropriate substitutions, and insert the result in place of the "duplicate" hint. Compilation proceeds as normal, failing if the type substitution results in invalid code.

[1] http://golang.org/ref/spec

Rémy Oudompheng

unread,
Mar 18, 2014, 7:36:57 PM3/18/14
to Evan Huus, golang-nuts

Can you explain how to define a tree structure in this setup?

Rémy.

Evan Huus

unread,
Mar 18, 2014, 7:38:23 PM3/18/14
to golan...@googlegroups.com, Evan Huus
In the runtime-boxing case, the hint would not change the compiled output at all; compiling

type T interface{}
type foo struct { a T; }
var bar foo.assert(T=int)
would result in *exactly the same* binary (down to the very last bit) as compiling

type T interface{}
type foo struct { a T; }
var bar foo
The only change is that with the hint, the compiler has the option of doing extra work to verify that everywhere bar.a is used, it is an int and not just a T. This makes the program safer when the user knows the intended type in advance.

Regardless, the examples were just that - examples. I apologize for explaining them poorly and causing more confusion, but they do not affect my broader point, which is the following:

The question "How should Go support generics?" is misleading. The more useful questions are "How should Go support safe boxing/unboxing?" and "How should Go support safe type-duplication?". My examples were poorly-thought-out answers to those two questions, but by far the more important point was that those are the questions we should be answering.

Evan Huus

unread,
Mar 18, 2014, 7:49:11 PM3/18/14
to golan...@googlegroups.com, Evan Huus

On Tuesday, March 18, 2014 7:36:57 PM UTC-4, Rémy Oudompheng wrote:

Can you explain how to define a tree structure in this setup?

This setup can be used as-is with any of the half-dozen existing tree packages that already work using interface{}. For example, we could write a package like:
package foo
type K interface{}
type V interface{}
type TreeNode struct {
    Left, Right *TreeNode
    Key K
    Value V
}
Note that this is already valid Go code; the package author does not need to do anything to support generics in this model except use sane aliases for their interface{}. Then a user of the package could write:
import "foo"
type myNode foo.TreeNode.duplicate(K=int, V=string)
and the result would be an implementation of foo.TreeNode with the appropriate type substitutions.

Mark Mandel

unread,
Mar 18, 2014, 8:37:42 PM3/18/14
to simon place, golan...@googlegroups.com
Not 100% sure what this would look like, but I am kinda intrigued by this idea.

Jonathan Amsterdam

unread,
Mar 18, 2014, 11:06:23 PM3/18/14
to golan...@googlegroups.com, Evan Huus


On Tuesday, March 18, 2014 7:38:23 PM UTC-4, Evan Huus wrote:
In the runtime-boxing case, the hint would not change the compiled output at all; compiling
type T interface{}
type foo struct { a T; }
var bar foo.assert(T=int)
would result in *exactly the same* binary (down to the very last bit) as compiling
type T interface{}
type foo struct { a T; }
var bar foo

Can I write bar.a++ ? If not, this doesn't accomplish much. If so, then there must be more going on than mere erasure.

Matt Sherman

unread,
Mar 19, 2014, 12:17:48 AM3/19/14
to golan...@googlegroups.com
Evan, you might be interested in my approach with gen: http://clipperhouse.github.io/gen/

It chooses the 'space' trade-off by codegen'ing for each type. The gen'd code just becomes part of your package, so it's compile-time safe, no type assertions.

It's not a generalized solution just yet. I'd like to make it more pluggable, where one can add templates ad-hoc, and it is currently limited to one type parameter. Would like to find a way to allow multiple type parameters, as in your tree example.

Mandolyte

unread,
Mar 19, 2014, 9:04:06 PM3/19/14
to golan...@googlegroups.com
About Russ Cox's #2 approach... which the gen package is some flavor of right?
Back on December 10 (see 1), Michael Jones at Google made a comment that he was experimenting with ideas for Go generics. Based on other comments he's made in the past, I suspect his approach was along these lines. He never, to my knowledge, revealed what he was thinking. Since several months have passed, perhaps he gave up.

I don't think the #2 approach is all that bad:
a) You shouldn't pay the penalty if you don't use it (slow compiles or code bloat, that is)
b) If you copy/paste to achieve the same result, you have the same amount of code - not sure why it would be slower than auto-generation of same

There are some questions about where do you hide the auto-gen'd code; how do you debug it and so forth... nothing that seems insurmountable.

At any rate, if generics ever do come to Go, my money is on a smarter #2 approach.

(1) The thread was "lack of generics - what's the alternative?" my starred link, which may not be usable for anyone else, is:

egon

unread,
Mar 20, 2014, 3:12:03 AM3/20/14
to golan...@googlegroups.com
This type of generics has been suggested before (I was one of them... :)). My take was that the type specialization happens at package level (it stops a lot of abuse of the templating and you can use the packages without having to use generics).. In essence a tree package could look like:

type Elem interface{
  Less(other Elem) bool
}
type Tree struct {
  value Elem
  left, right *Tree
}

And when importing you do something like:

import valtree "tree" { Elem: Val }

The imported package would be equivalent to other packages imported with the same specializations. Which means the only addition to the language syntax would be the importing specialization... of course I'm not sure about the compiler side (I can imagine the code duplication checking would be easier, because it's possible to deal with it at package import level; also I think it would be easier to implement than type level specialization).

+ egon

Rémy Oudompheng

unread,
Mar 20, 2014, 3:25:59 AM3/20/14
to egon, golang-nuts
2014-03-20 8:12 GMT+01:00 egon <egon...@gmail.com>:
> This type of generics has been suggested before (I was one of them... :)).
> My take was that the type specialization happens at package level (it stops
> a lot of abuse of the templating and you can use the packages without having
> to use generics).. In essence a tree package could look like:
>
> type Elem interface{
> Less(other Elem) bool
> }
> type Tree struct {
> value Elem
> left, right *Tree
> }
>
> And when importing you do something like:
>
> import valtree "tree" { Elem: Val }
>
> The imported package would be equivalent to other packages imported with the
> same specializations. Which means the only addition to the language syntax
> would be the importing specialization... of course I'm not sure about the
> compiler side (I can imagine the code duplication checking would be easier,
> because it's possible to deal with it at package import level; also I think
> it would be easier to implement than type level specialization).
>
> + egon

How do you do the following pattern with specializable packages ?
(suppose you have a generic package providing a Set data structure).

type Server struct {
Clients Set<Client>
}

type Client struct {
Services Set<Server>
}

I think providing a generic data structure without this ability
reduces its usability quite much.

Rémy.

egon

unread,
Mar 20, 2014, 3:46:38 AM3/20/14
to golan...@googlegroups.com, egon
import {
   clients "set" {Elem: Client}
   servers "set" {Elem: Server} 
}

type Server struct {
    Clients clients.Set
}

type Client struct {
    Services servers.Set
}

Of course this might cause some problems in the compiler due to the circular reference.


I think providing a generic data structure without this ability
reduces its usability quite much.

Yes, I would agree, this generics variant definitely doesn't fit all generics needs.

+ egon

 
 

Rémy.

Jérôme Champion

unread,
Mar 20, 2014, 5:00:00 AM3/20/14
to golan...@googlegroups.com, egon
I did a an experiment with annotation to generate specialized package based on the behavior you describe : https://github.com/champioj/geno ( I'm currently changing the annotation to be in the package comment section )
It work surprisingly well except for one drawback : circular reference. I'm not sure if it's possible to hack around this problem...or if any change to the language would be needed to help this case.

egon

unread,
Mar 20, 2014, 6:15:01 AM3/20/14
to golan...@googlegroups.com, egon


On Thursday, March 20, 2014 11:00:00 AM UTC+2, Jérôme Champion wrote:
I did a an experiment with annotation to generate specialized package based on the behavior you describe : https://github.com/champioj/geno ( I'm currently changing the annotation to be in the package comment section )
It work surprisingly well except for one drawback : circular reference. I'm not sure if it's possible to hack around this problem...or if any change to the language would be needed to help this case.

Adding circular references also for packages would be a bad idea, so an internal compiler change would be required.

The automatic boxing approach should work with the current compiler, but could cause similar problems as Java-s autoboxing. (Idea is that whenever you reference "Elem" you cast it to the appropriate type; also it would require some additional type-checking code.)

+ egon
Message has been deleted
Message has been deleted
Message has been deleted

Lars Seipel

unread,
Mar 23, 2014, 1:35:07 PM3/23/14
to liv...@gmail.com, golan...@googlegroups.com, egon
On Sat, Mar 22, 2014 at 01:36:12PM -0700, liv...@gmail.com wrote:
> It seems you are interested in experiments. Have a look at my approach to
> generics https://github.com/leeview/godsl

Do you also offer weight loss pills and cheap loans without credit
check?

Now seriously, while I appreciate your enthusiasm, sending your project
link to to the list multiple times within just a few minutes doesn't
leave the impression you probably intended. It appears rude and spammy.

Lars
Reply all
Reply to author
Forward
0 new messages