Groups keyboard shortcuts have been updated
Dismiss
See shortcuts

Implementing Generics

7,066 views
Skip to first unread message

Keith Randall

unread,
Sep 17, 2020, 6:10:14 PM9/17/20
to golang-dev
The Go generics proposal is working its way through the proposal process. It's a very complicated proposal, which begs the question:

    If the generics proposal is accepted, how are we going to implement it?

The Go team at Google has put some thought into this question all along, but we think now is the time to start thinking more seriously and widely about it. Hence this email. Note that the generics proposal is not accepted yet, so we'll have to discuss implementations without having a final accepted proposal in hand. Nevertheless, the outline of the proposal is pretty clear and provides a sufficient basis to consider implementation strategies, even if we might have to revise those strategies if changes to the proposal are forthcoming.

Of course, if the proposal is declined we'll have wasted some work. If it is accepted in some form, however, we'll hopefully have accelerated the progress to an implementation.

I've written up two design docs containing two possible strategies for implementing generics. They are:
  1. Stenciling. This strategy generates a separate implementation of a generic function for every instantiation (set of provided type arguments). This is more-or-less how C++ does generics.
  2. Dictionaries. This strategy generates a single implementation of a generic function which can handle all possible type arguments. This is kind-of how Java does generics, although the analogy is weaker than the stenciling-C++ one.
There may be other implementation strategies (including hybrids of the two above); these two are just to set the ball rolling on discussion.

What do people think? Which strategy do you like? What glaring showstopper are we forgetting?

At a high level, stenciling allows for better runtime performance, at the cost of larger binaries and slower compile times. Stenciling is also somewhat simpler to implement in the compiler. But those are generalizations and depend a lot on the code being compiled and the various mitigations (discussed in the design docs) that could be brought to bear.

There are also lots of other smaller decisions we'll need to make about generics implementation. How do we modify the compiler's parser? Its internal type representation? What in the runtime needs to understand generics? All of these are also fair game for this discussion.

And on a final note, I'd like to brainstorm about prep work we can do now that will make generics easier if/when the proposal gets accepted. We're obviously not going to have generics in for 1.16, but we might want to do some prep work in 1.16 for pieces we know we'll need, or think we'll need, for an eventual generics implementation. For example, I've been working on a prototype of variable-sized frames which would be useful for the dictionary proposal above, but might also be useful for other things, like reflect.Call. Another idea is cleaning up the compiler organization to make it easier to plug in a different typechecker.

Process notes:

- Let's start discussing the topics outlined above in this email thread. If the discussion gets verbose enough, we might want to move it somewhere else (e.g. a reddit thread). We'll cross that bridge when/if we get to it.
- If you have specific edits to the docs linked above, you can mail me a CL.
- This discussion is about implementation of the generics proposal, not the generics proposal itself. The line is a bit fuzzy, but if you're unsure you might consider discussing the generics proposal (currently happening on golang-nuts, search for the [generics] subject line) instead of posting here.

Brad Fitzpatrick

unread,
Sep 17, 2020, 7:46:52 PM9/17/20
to Keith Randall, golang-dev
I thought it was already said somewhere that there'd be no point to this whole generics proposal if the implementation carried a runtime cost. I don't think it'll be used as much if it's slow or GC heavy at runtime (like how nobody uses the container packages in std)

I'd hope for a hybrid: stenciling but sharing instantiations when they had the same GC bitmap shape.


--
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-dev/CAB%3Df9xT9%2BYHA5qT2GmUAoQ-ne_a8RWg4uRKDXwSgdn0M5RxdTA%40mail.gmail.com.

Ian Lance Taylor

unread,
Sep 17, 2020, 8:15:02 PM9/17/20
to Brad Fitzpatrick, Keith Randall, golang-dev
On Thu, Sep 17, 2020 at 4:46 PM Brad Fitzpatrick <brad...@golang.org> wrote:
>
> I thought it was already said somewhere that there'd be no point to this whole generics proposal if the implementation carried a runtime cost. I don't think it'll be used as much if it's slow or GC heavy at runtime (like how nobody uses the container packages in std)

I suspect (with zero actual evidence) that for most generic functions
a dictionary implementation would carry overhead similar to using
interface types today. I think most Go programmers would find that
acceptable.

I don't think we should require that generics carries zero runtime
cost compared to writing out the code multiple times with different
types. That is, I don't think we should set zero abstraction penalty
as a goal. Perhaps we can get there, but I think that making that a
requirement will set us up for failure. People who truly need that
performance can already use code generation. Generics has to have
very good performance; it's not obvious to me that it has to have
perfect performance.

(That said, to be clear, I think that the memory performance of a
generic data structure has to be perfect. I think it's important that
people be able to easily understand the memory requirements of their
generic data structures.
https://go.googlesource.com/proposal/+/refs/heads/master/design/go2draft-type-parameters.md#values-of-type-parameters-are-not-boxed)

Ian

Austin Clements

unread,
Sep 17, 2020, 8:16:28 PM9/17/20
to Keith Randall, golang-dev
One concern with a pure stenciling approach is that the costs of it won't be apparent immediately. The costs in terms of build time, binary size, and the indirect impact that binary size has on performance will scale with its usage, so we may not see them until it's so widely adopted that it's hard to change course.

I think the best long-term answer is a hybrid solution, ideally one that chooses between stenciling and dictionaries using profile information. To me, that leaves the question of the best path to get there, and I argue that because the costs of stenciling are hidden, it's better to start with the dictionary approach and view stenciling as an optimization. This will also ensure we don't accidentally back ourselves into a design that doesn't support the dictionary approach (it's also possible to back ourselves into a design that doesn't support stenciling, but I think we understand those limitations much better).

To Brad's point, dictionary-passing probably won't perform as well as stenciling, though we shouldn't discount the impact of stenciling on TLB and cache pressure. But it's not fair to compare it to the container packages both because it will almost certainly perform much better than the container packages, and because the other major limitation of the container packages is usability, which any implementation of generics will address.


Austin Clements

unread,
Sep 17, 2020, 8:24:14 PM9/17/20
to Keith Randall, golang-dev
On Thu, Sep 17, 2020 at 8:16 PM Austin Clements <aus...@google.com> wrote:
But it's not fair to compare it to the container packages both because it will almost certainly perform much better than the container packages, and because the other major limitation of the container packages is usability, which any implementation of generics will address.

I should clarify why I think it will perform better (partly because Ian said it will carry similar overhead about 60 seconds before I sent this claim :). Because the current containers (at least, the ones no one uses) use interface{}s in the data structures themselves, they force a level of indirection. Both stenciling and dictionary-passing will let us embed values directly in container types, significantly reducing the memory and GC overhead for storing scalar values (and slightly reducing it when storing pointer values by at least eliminating the "type" field implicit in the interface value). I think the container methods themselves would perform similarly to the interface{}-based ones. It will also save on the costs of type assertions for users of the containers, though that cost is probably small right now.

Robert Engels

unread,
Sep 17, 2020, 8:53:37 PM9/17/20
to Austin Clements, Keith Randall, golang-dev
What stops doing both? A compile time flag or even based on thresholds of generic types used. Similar to how a compiler decides on inlining. 

On Sep 17, 2020, at 7:24 PM, 'Austin Clements' via golang-dev <golan...@googlegroups.com> wrote:


On Thu, Sep 17, 2020 at 8:16 PM Austin Clements <aus...@google.com> wrote:
But it's not fair to compare it to the container packages both because it will almost certainly perform much better than the container packages, and because the other major limitation of the container packages is usability, which any implementation of generics will address.

I should clarify why I think it will perform better (partly because Ian said it will carry similar overhead about 60 seconds before I sent this claim :). Because the current containers (at least, the ones no one uses) use interface{}s in the data structures themselves, they force a level of indirection. Both stenciling and dictionary-passing will let us embed values directly in container types, significantly reducing the memory and GC overhead for storing scalar values (and slightly reducing it when storing pointer values by at least eliminating the "type" field implicit in the interface value). I think the container methods themselves would perform similarly to the interface{}-based ones. It will also save on the costs of type assertions for users of the containers, though that cost is probably small right now.

--
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.

Keith Randall

unread,
Sep 17, 2020, 10:10:04 PM9/17/20
to Robert Engels, Austin Clements, golang-dev
On Thu, Sep 17, 2020 at 5:53 PM Robert Engels <ren...@ix.netcom.com> wrote:
What stops doing both? A compile time flag or even based on thresholds of generic types used. Similar to how a compiler decides on inlining. 


I think the main barrier here is complexity. It's going to take a lot of work to implement either scheme. Multiply that by 2 if we're going to do both.

roger peppe

unread,
Sep 18, 2020, 7:51:44 AM9/18/20
to Brad Fitzpatrick, Keith Randall, golang-dev
On Fri, 18 Sep 2020 at 00:46, Brad Fitzpatrick <brad...@golang.org> wrote:
I thought it was already said somewhere that there'd be no point to this whole generics proposal if the implementation carried a runtime cost. I don't think it'll be used as much if it's slow or GC heavy at runtime (like how nobody uses the container packages in std)

I'd hope for a hybrid: stenciling but sharing instantiations when they had the same GC bitmap shape.

This is what I'd expect too. It'll still need a dictionary so that you can invoke generic operations (there would
be a runtime cost similar to the runtime cost of invoking methods on an interface because method calls would
be indirect), but at least you can copy values around and look up items on the stack without extra cost.

In general, you can share code between implementations that have the same alignment and GC bitmap shape
without changing the code generation for function bodies much. I actually prototyped this as an idea some time ago using the current compiler
and writing the instantiated code as if it had been generated automatically. I just updated it to use the latest
syntax - it's at https://github.com/rogpeppe/genericdemo and contains a working (well, it's lightly tested at least)
version of the graph example from the original generics draft protocol.

In that idea, the "stencil" part involves writing a top-level shim that invokes the generic function under the hood,
passing it the dictionary containing all the necessary functions. I prototyped both GC-bitmap-based sharing
and also a reflect-based generic function that can deal with any type. The latter could obviously be a lot more
efficient if the compiler was doing it, but it would still inevitably be quite a lot slower. Consider this generic function, for example:

    func F[A, B, C any](f func()[]struct{a A, b B, c C}, i int) C {
        return f()[i].c
    }

For a stencilled function, this function can be implemented with function call, a range check and a constant-offset indirect
access. For a purely generic function, it'll also need to do a multiply and an add of indirect values in order to find the
right member. This applies to stack accesses too. That's really going to hurt in tight loops, I think.

On a slightly different subject: not that generic methods or function values are proposed, but if they
were, what about the possibility of allowing generic methods in interfaces and function values, but
let them be instantiated only with pointer, string, slice or interface-shaped objects? That way, for those kinds of type
you get similar performance for generic interface method or function calls as for the rest of the bitmap-shape-shared
generic functions - the compiler prevents you seeing a sharp dropoff in performance by using a type parameter
that's not efficiently implemented. You get a code-expansion factor of 4, but no more (and you can probably
limit that by producing that code only for functions/methods that are assigned to a generic function value
or interface with a generic method.

Something that I haven't seen talked about much yet is how to deal with the requirement
that the instantiations for a generic type must be seen before you can generate the code for that 
type. This surely has implications for module compilation: would the body of every generic method and
function need to be exported as source like potentially inlined functions are exported today, or would there
be some kind of intermediate language involved? (SSA?)

   cheers,
    rog.

aind...@gmail.com

unread,
Sep 18, 2020, 9:29:00 AM9/18/20
to golang-dev
I believe Swift uses a “dictionary-passing” approach in its implementation of generics. There's a talk on it here: https://youtu.be/ctS8FzqcRug
They have a type system feature called "protocols," which are similar to interfaces. They use witness tables to capture the allocation, copying, and destruction of a value, as well as conform to its protocol. This enables them to do monomorphization as an optimization, which is what I think is being discussed here.

Russ Cox

unread,
Sep 18, 2020, 11:36:33 AM9/18/20
to aind...@gmail.com, golang-dev
On Fri, Sep 18, 2020 at 9:29 AM aind...@gmail.com <aind...@gmail.com> wrote:
I believe Swift uses a “dictionary-passing” approach in its implementation of generics. There's a talk on it here: https://youtu.be/ctS8FzqcRug
They have a type system feature called "protocols," which are similar to interfaces. They use witness tables to capture the allocation, copying, and destruction of a value, as well as conform to its protocol.

So that no one has to spend the time I did trying to figure this out long ago: "witness table" is just the Swift word for a Go itable/C++ vtable.

Best,
Russ

Tommie Gannert

unread,
Sep 18, 2020, 11:43:19 AM9/18/20
to golang-dev
On Friday, 18 September 2020 at 00:10:14 UTC+2 Keith Randall wrote:
The Go generics proposal is working its way through the proposal process. It's a very complicated proposal, which begs the question:

    If the generics proposal is accepted, how are we going to implement it?

The Go team at Google has put some thought into this question all along, but we think now is the time to start thinking more seriously and widely about it. Hence this email. Note that the generics proposal is not accepted yet, so we'll have to discuss implementations without having a final accepted proposal in hand. Nevertheless, the outline of the proposal is pretty clear and provides a sufficient basis to consider implementation strategies, even if we might have to revise those strategies if changes to the proposal are forthcoming.

Of course, if the proposal is declined we'll have wasted some work. If it is accepted in some form, however, we'll hopefully have accelerated the progress to an implementation.

I've written up two design docs containing two possible strategies for implementing generics. They are:
  1. Stenciling. This strategy generates a separate implementation of a generic function for every instantiation (set of provided type arguments). This is more-or-less how C++ does generics.
  2. Dictionaries. This strategy generates a single implementation of a generic function which can handle all possible type arguments. This is kind-of how Java does generics, although the analogy is weaker than the stenciling-C++ one.
There may be other implementation strategies (including hybrids of the two above); these two are just to set the ball rolling on discussion. 

What do people think? Which strategy do you like? What glaring showstopper are we forgetting?

The way I'd write this today is using type switches over interface{} where I need to narrow the type. If the compiler rewrote to that, it would be "zero surprises" for me in terms of performance, so I'm wondering if that was considered before the Dictionaries suggestion? I could imagine many functions just plumb through to others without narrowing types, making dictionaries add overhead (memory and CPU) where it would be surprising to me.

Keith Randall

unread,
Sep 18, 2020, 12:17:44 PM9/18/20
to Tommie Gannert, golang-dev
On Fri, Sep 18, 2020 at 8:43 AM Tommie Gannert <tgan...@gmail.com> wrote:
On Friday, 18 September 2020 at 00:10:14 UTC+2 Keith Randall wrote:
The Go generics proposal is working its way through the proposal process. It's a very complicated proposal, which begs the question:

    If the generics proposal is accepted, how are we going to implement it?

The Go team at Google has put some thought into this question all along, but we think now is the time to start thinking more seriously and widely about it. Hence this email. Note that the generics proposal is not accepted yet, so we'll have to discuss implementations without having a final accepted proposal in hand. Nevertheless, the outline of the proposal is pretty clear and provides a sufficient basis to consider implementation strategies, even if we might have to revise those strategies if changes to the proposal are forthcoming.

Of course, if the proposal is declined we'll have wasted some work. If it is accepted in some form, however, we'll hopefully have accelerated the progress to an implementation.

I've written up two design docs containing two possible strategies for implementing generics. They are:
  1. Stenciling. This strategy generates a separate implementation of a generic function for every instantiation (set of provided type arguments). This is more-or-less how C++ does generics.
  2. Dictionaries. This strategy generates a single implementation of a generic function which can handle all possible type arguments. This is kind-of how Java does generics, although the analogy is weaker than the stenciling-C++ one.
There may be other implementation strategies (including hybrids of the two above); these two are just to set the ball rolling on discussion. 

What do people think? Which strategy do you like? What glaring showstopper are we forgetting?

The way I'd write this today is using type switches over interface{} where I need to narrow the type. If the compiler rewrote to that, it would be "zero surprises" for me in terms of performance, so I'm wondering if that was considered before the Dictionaries suggestion? I could imagine many functions just plumb through to others without narrowing types, making dictionaries add overhead (memory and CPU) where it would be surprising to me.
 

I'm not entirely sure what you are suggesting. Do you mean that we type-erase, i.e. replace every occurrence of a generic type T with interface{}, then add casts in the appropriate places?

I don't think this entirely works. What about:

func index[T any](a []T, i int) T {
    return a[i]
}
r := index[int64]([]int64{1,2,3}, 1)

We can't just treat []T as []interface{}, because int and interface{} have different sizes, so the indexing computation required is type-dependent.
Same for the return value. The caller is expecting an 8-byte return value, not a 16-byte interface that it needs to cast. Especially if you do:

f := index[int64]
r := f([]int64{1,2,3}, 1)

Maybe we could introduce wrappers functions for situations like this?


At a high level, stenciling allows for better runtime performance, at the cost of larger binaries and slower compile times. Stenciling is also somewhat simpler to implement in the compiler. But those are generalizations and depend a lot on the code being compiled and the various mitigations (discussed in the design docs) that could be brought to bear.

There are also lots of other smaller decisions we'll need to make about generics implementation. How do we modify the compiler's parser? Its internal type representation? What in the runtime needs to understand generics? All of these are also fair game for this discussion.

And on a final note, I'd like to brainstorm about prep work we can do now that will make generics easier if/when the proposal gets accepted. We're obviously not going to have generics in for 1.16, but we might want to do some prep work in 1.16 for pieces we know we'll need, or think we'll need, for an eventual generics implementation. For example, I've been working on a prototype of variable-sized frames which would be useful for the dictionary proposal above, but might also be useful for other things, like reflect.Call. Another idea is cleaning up the compiler organization to make it easier to plug in a different typechecker.

Process notes:

- Let's start discussing the topics outlined above in this email thread. If the discussion gets verbose enough, we might want to move it somewhere else (e.g. a reddit thread). We'll cross that bridge when/if we get to it.
- If you have specific edits to the docs linked above, you can mail me a CL.
- This discussion is about implementation of the generics proposal, not the generics proposal itself. The line is a bit fuzzy, but if you're unsure you might consider discussing the generics proposal (currently happening on golang-nuts, search for the [generics] subject line) instead of posting here.

--
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.

Keith Randall

unread,
Sep 18, 2020, 12:24:07 PM9/18/20
to roger peppe, Brad Fitzpatrick, golang-dev
On Fri, Sep 18, 2020 at 4:51 AM roger peppe <rogp...@gmail.com> wrote:
On Fri, 18 Sep 2020 at 00:46, Brad Fitzpatrick <brad...@golang.org> wrote:
I thought it was already said somewhere that there'd be no point to this whole generics proposal if the implementation carried a runtime cost. I don't think it'll be used as much if it's slow or GC heavy at runtime (like how nobody uses the container packages in std)

I'd hope for a hybrid: stenciling but sharing instantiations when they had the same GC bitmap shape.

This is what I'd expect too. It'll still need a dictionary so that you can invoke generic operations (there would
be a runtime cost similar to the runtime cost of invoking methods on an interface because method calls would
be indirect), but at least you can copy values around and look up items on the stack without extra cost.

In general, you can share code between implementations that have the same alignment and GC bitmap shape
without changing the code generation for function bodies much. I actually prototyped this as an idea some time ago using the current compiler
and writing the instantiated code as if it had been generated automatically. I just updated it to use the latest
syntax - it's at https://github.com/rogpeppe/genericdemo and contains a working (well, it's lightly tested at least)
version of the graph example from the original generics draft protocol.


There's some discussion in the stenciling doc about deduplicating implementations, which I think would capture the case of two types that have identical GC bitmap shapes. Do you think that would be sufficient, or not?

roger peppe

unread,
Sep 18, 2020, 1:34:06 PM9/18/20
to Keith Randall, Brad Fitzpatrick, golang-dev
Forgive me, I hadn't observed that the email starting this thread has links to more substantial documents. I'll read those pronto!

roger peppe

unread,
Sep 18, 2020, 1:54:01 PM9/18/20
to Keith Randall, Brad Fitzpatrick, golang-dev
On Fri, 18 Sep 2020, 17:23 Keith Randall, <k...@google.com> wrote:


On Fri, Sep 18, 2020 at 4:51 AM roger peppe <rogp...@gmail.com> wrote:
On Fri, 18 Sep 2020 at 00:46, Brad Fitzpatrick <brad...@golang.org> wrote:
I thought it was already said somewhere that there'd be no point to this whole generics proposal if the implementation carried a runtime cost. I don't think it'll be used as much if it's slow or GC heavy at runtime (like how nobody uses the container packages in std)

I'd hope for a hybrid: stenciling but sharing instantiations when they had the same GC bitmap shape.

This is what I'd expect too. It'll still need a dictionary so that you can invoke generic operations (there would
be a runtime cost similar to the runtime cost of invoking methods on an interface because method calls would
be indirect), but at least you can copy values around and look up items on the stack without extra cost.

In general, you can share code between implementations that have the same alignment and GC bitmap shape
without changing the code generation for function bodies much. I actually prototyped this as an idea some time ago using the current compiler
and writing the instantiated code as if it had been generated automatically. I just updated it to use the latest
syntax - it's at https://github.com/rogpeppe/genericdemo and contains a working (well, it's lightly tested at least)
version of the graph example from the original generics draft protocol.


There's some discussion in the stenciling doc about deduplicating implementations, which I think would capture the case of two types that have identical GC bitmap shapes. Do you think that would be sufficient, or not?

After a brief scan, no I don't think that would be sufficient in general for anything but "any" type parameters. The moment you have a shared implementation and available operations on the type, you will end up needing a dictionary AFAICS.

It's not such an involved dictionary as is needed for the wholly generic implementation, I think, because you don't need to pass in the size information that's implicit in the instantiated code, and less runtime support is needed because all frame sizes etc are known statically.

I suspect that any realistic implementation that manages to avoid C++-style code blowup while maintaining reasonable performance will need to be a hybrid of both approaches.

  cheers,
    rog.

Tommie Gannert

unread,
Sep 18, 2020, 5:32:30 PM9/18/20
to golang-dev
On Friday, 18 September 2020 at 18:17:44 UTC+2 Keith Randall wrote:
On Fri, Sep 18, 2020 at 8:43 AM Tommie Gannert <tgan...@gmail.com> wrote:
On Friday, 18 September 2020 at 00:10:14 UTC+2 Keith Randall wrote:
The Go generics proposal is working its way through the proposal process. It's a very complicated proposal, which begs the question:

    If the generics proposal is accepted, how are we going to implement it?

The Go team at Google has put some thought into this question all along, but we think now is the time to start thinking more seriously and widely about it. Hence this email. Note that the generics proposal is not accepted yet, so we'll have to discuss implementations without having a final accepted proposal in hand. Nevertheless, the outline of the proposal is pretty clear and provides a sufficient basis to consider implementation strategies, even if we might have to revise those strategies if changes to the proposal are forthcoming.

Of course, if the proposal is declined we'll have wasted some work. If it is accepted in some form, however, we'll hopefully have accelerated the progress to an implementation.

I've written up two design docs containing two possible strategies for implementing generics. They are:
  1. Stenciling. This strategy generates a separate implementation of a generic function for every instantiation (set of provided type arguments). This is more-or-less how C++ does generics.
  2. Dictionaries. This strategy generates a single implementation of a generic function which can handle all possible type arguments. This is kind-of how Java does generics, although the analogy is weaker than the stenciling-C++ one.
There may be other implementation strategies (including hybrids of the two above); these two are just to set the ball rolling on discussion. 

What do people think? Which strategy do you like? What glaring showstopper are we forgetting?

The way I'd write this today is using type switches over interface{} where I need to narrow the type. If the compiler rewrote to that, it would be "zero surprises" for me in terms of performance, so I'm wondering if that was considered before the Dictionaries suggestion? I could imagine many functions just plumb through to others without narrowing types, making dictionaries add overhead (memory and CPU) where it would be surprising to me.
 

I'm not entirely sure what you are suggesting. Do you mean that we type-erase, i.e. replace every occurrence of a generic type T with interface{}, then add casts in the appropriate places?

I don't think this entirely works. What about:

func index[T any](a []T, i int) T {
    return a[i]
}
r := index[int64]([]int64{1,2,3}, 1)

We can't just treat []T as []interface{}, because int and interface{} have different sizes, so the indexing computation required is type-dependent.

I was thinking that a is interface{}, not []interface{}. That's how we're doing it now, e.g. in sort.Slice.
 
Same for the return value. The caller is expecting an 8-byte return value, not a 16-byte interface that it needs to cast. Especially if you do:

f := index[int64]
r := f([]int64{1,2,3}, 1)

Maybe we could introduce wrappers functions for situations like this?

Yeah, that might be a useful hybrid to avoid affecting code generated at call sites. Assuming arguments are interface{} and type-switched, the wrapper only needs to deal with return value(s), and will probably be short. So even if there are many combinations of return types in a full binary, the wrappers might not cost that much.

Bakul Shah

unread,
Sep 19, 2020, 1:07:13 AM9/19/20
to Keith Randall, golang-dev
Are there any stats on how many different generic types/functions are used in Java / C++ programs? I am wondering about how much is "larger binaries" a real problem vs imagined.

I too feel that stenciling would be *simpler* to implement. It is not clear to me if slow C++ compile time issues apply here. Presumably each unique concrete type or functions constructed out of generics can be cached?

--
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.

roger peppe

unread,
Sep 19, 2020, 3:25:46 AM9/19/20
to Bakul Shah, Keith Randall, golang-dev


On Sat, 19 Sep 2020, 06:07 Bakul Shah, <ba...@iitbombay.org> wrote:
I too feel that stenciling would be *simpler* to implement. It is not clear to me if slow C++ compile time issues apply here. Presumably each unique concrete type or functions constructed out of generics can be cached?

What do you mean by that?

minux

unread,
Sep 19, 2020, 4:45:23 AM9/19/20
to roger peppe, Bakul Shah, Keith Randall, golang-dev
On Sat, Sep 19, 2020 at 3:25 AM roger peppe <rogp...@gmail.com> wrote:
On Sat, 19 Sep 2020, 06:07 Bakul Shah, <ba...@iitbombay.org> wrote:
I too feel that stenciling would be *simpler* to implement. It is not clear to me if slow C++ compile time issues apply here. Presumably each unique concrete type or functions constructed out of generics can be cached?

What do you mean by that?

I think it means that we can make our go build cache caching not only objects built from whole packages,
but also unique instantiations of generic functions and methods.

In fact, while I thought about this, we probably can split the objects into two parts: non-generic part and
generic instantiations, while the latter will not be saved in the package object file, but as standalone
object files that will be pulled in as necessary by the linker. This avoids the other code bloat problem
of C++ templates, where the C++ compiler must be conservative and generate implementation
for all used template instantiations in the output object file for each source file and relying on the linker
to dedup them. As the Go compiler knows about the cache, it can check to see if the implementation
already exists before recreating it.

Of course, we still need to be able to inline generic functions/methods, so only those parts that can't
be inlined should be stored separately in the build cache.

Furthermore, such a caching system might be able to support a more generic form of deduplication
by making it a content addressable cache keyed by the object code & its distinguishing metadata
(e.g. for structs, only the types of fields are necessary part of the key, the name of the fields can be
discarded.)

This generalized deduplication is not limited to generic code though, if we split the package object
files up into meaningful smaller parts that are separately cached.

Bakul Shah

unread,
Sep 19, 2020, 11:00:56 AM9/19/20
to minux, roger peppe, Keith Randall, golang-dev
On Sep 19, 2020, at 1:45 AM, minux <mi...@golang.org> wrote:


On Sat, Sep 19, 2020 at 3:25 AM roger peppe <rogp...@gmail.com> wrote:
On Sat, 19 Sep 2020, 06:07 Bakul Shah, <ba...@iitbombay.org> wrote:
I too feel that stenciling would be *simpler* to implement. It is not clear to me if slow C++ compile time issues apply here. Presumably each unique concrete type or functions constructed out of generics can be cached?

What do you mean by that?

Minux explains it much better than I could've!


I think it means that we can make our go build cache caching not only objects built from whole packages,
but also unique instantiations of generic functions and methods.

In fact, while I thought about this, we probably can split the objects into two parts: non-generic part and
generic instantiations, while the latter will not be saved in the package object file, but as standalone
object files that will be pulled in as necessary by the linker. This avoids the other code bloat problem
of C++ templates, where the C++ compiler must be conservative and generate implementation
for all used template instantiations in the output object file for each source file and relying on the linker
to dedup them. As the Go compiler knows about the cache, it can check to see if the implementation
already exists before recreating it.

This is one of the reasons I favored generic packages (back in 2014). Since genericity in Go
is purely a compile time issue, conceptually you can just create a subpackage for each unique
concretized package. A preprocessor could've done so. Then the preprocessor writes the code
instead of a human. Makes it easier to grasp the additional compile time cost of generics.

In the current generic proposal, caching would serve a similar purpose. If one were to implement
generics using a preprocessor, I think it can write separate files for each unique combination in
the same package directory (and remove the generic code).


Of course, we still need to be able to inline generic functions/methods, so only those parts that can't
be inlined should be stored separately in the build cache.

Furthermore, such a caching system might be able to support a more generic form of deduplication
by making it a content addressable cache keyed by the object code & its distinguishing metadata
(e.g. for structs, only the types of fields are necessary part of the key, the name of the fields can be
discarded.)

This generalized deduplication is not limited to generic code though, if we split the package object
files up into meaningful smaller parts that are separately cached.

A fine grained ccache :-)

афывафывавы ывафывафыва

unread,
Sep 19, 2020, 2:43:54 PM9/19/20
to golang-dev
My thoughts on it

  • Generic stuff is usually simple and will not weight a lot in a majority of cases with stenciling.
  • Many projects use code generation for their containers and logic, not "generic" types with interface{}, so stenciling would be an automation for what they are already doing.
  • Java can afford the second approach as (at least in theory) they have JIT which may turn initial dictionary-based code into an efficient one.

пятница, 18 сентября 2020 г. в 01:10:14 UTC+3, Keith Randall:

aind...@gmail.com

unread,
Sep 19, 2020, 3:27:54 PM9/19/20
to golang-dev
> My thoughts on it

We should not assume that if generics land, that they will only be used as a
replacement for current uses of code generation or type assertions. Even a
few higher-order functions could cause a binary size and compile-time blowup
if they are used everywhere. Things like map/filter/fold, channel manipulation
functions, or error handling helpers.

Денис Черемисов

unread,
Sep 20, 2020, 1:51:01 AM9/20/20
to golang-dev
> Things like map/filter/fold, channel manipulation
functions, or error handling helpers.

Didn't though about it indeed. Then the combined approach
seems reasonable:

1) Generic types and their methods are via stenciling.
2) Generic functions with type list constraints are via stenciling too.
3) Generic functions with "generic" constraints are dictionary based.

This seems mostly OK for our needs (can't say for everyone of course):

* We uses some generated types in hot loops and it would be a bit undesirable 
   to have additional overhead and GC pressure on them. And for some of them
   it would not be just "a bit".
* We mostly want generics functions for error handling stuff, channel helpers
   and one thing: handling gRPC bistreams. An overhead will be negligible compared
   to their workload (network interactions, even looping, reflect usage, etc)



суббота, 19 сентября 2020 г. в 22:27:54 UTC+3, aind...@gmail.com:

minux

unread,
Sep 20, 2020, 3:11:30 AM9/20/20
to aind...@gmail.com, Денис Черемисов, golang-dev
On Sat, Sep 19, 2020 at 3:28 PM aind...@gmail.com <aind...@gmail.com> wrote:
> My thoughts on it

We should not assume that if generics land, that they will only be used as a
replacement for current uses of code generation or type assertions. Even a
few higher-order functions could cause a binary size and compile-time blowup
if they are used everywhere. Things like map/filter/fold, channel manipulation
functions, or error handling helpers.

The examples you gave (map/filter/fold and channel manipulation functions) are
exactly what I think should not only be stenciled, but also aggressively inlined.

e.g. a map function implemented as a dictionary is not as good as just inlining
the for loop yourself, and frankly, it's not that better than writing individual map
functions.

On Sun, Sep 20, 2020 at 1:50 AM Денис Черемисов <denis.ch...@gmail.com> wrote:
> Things like map/filter/fold, channel manipulation
functions, or error handling helpers.

Didn't though about it indeed. Then the combined approach
seems reasonable:

1) Generic types and their methods are via stenciling.
2) Generic functions with type list constraints are via stenciling too.
3) Generic functions with "generic" constraints are dictionary based.

This seems mostly OK for our needs (can't say for everyone of course):

For 3), isn't that we can already implement as functions taking interface values?

I think it's perhaps better to focus on a concrete example instead of talking only about abstract
use cases. I think it's without question that small generic functions must be inlined (e.g. max/min),
so let's choose a bigger example. If we have an equivalent generic sort.Slice function available,
do you want it to be stenciled or dictionary based? It does involve quite a bit of code, and it's also
one of the cases where C++ outperforms equivalent C code because of the inlined code.

My thought is that, if the user does not want the code bloat, then they can always write it in an
interface based implementation as they are today. In fact, if we allow generic methods to implement
interfaces, we can also drastically decrease the amount of boilerplate code necessary for these use
cases.

roger peppe

unread,
Sep 21, 2020, 11:33:07 AM9/21/20
to Keith Randall, golang-dev
You mention recursive generic functions. This can become quite pathological if we're not careful about the rules.
For example, here's a short program that generates over 300k different generic functions: https://go2goplay.golang.org/p/GjBIOAYc_uu


Bakul Shah

unread,
Sep 21, 2020, 12:13:09 PM9/21/20
to roger peppe, Keith Randall, golang-dev
On Sep 21, 2020, at 8:32 AM, roger peppe <rogp...@gmail.com> wrote:
>
> You mention recursive generic functions. This can become quite pathological if we're not careful about the rules.
> For example, here's a short program that generates over 300k different generic functions: https://go2goplay.golang.org/p/GjBIOAYc_uu

The reason I asked for stats on generics use in Java and other
languages was to get an idea of how they are used. If the kind
of example you wrote is common, then it would make sense to
think of optimizing it. But I very much doubt it is.
Don't let the perfect be the enemy of good!

roger peppe

unread,
Sep 21, 2020, 12:29:42 PM9/21/20
to Bakul Shah, Keith Randall, golang-dev
I'm thinking of programs like the above as a kind of DoS attack. If it's easy to make a program with super-exponential compile time, that
could be a significant issue.

Robert Engels

unread,
Sep 21, 2020, 1:44:14 PM9/21/20
to roger peppe, Bakul Shah, Keith Randall, golang-dev
With respect to Java - almost all usages are the standard collections classes. 


Which is why I said Go doesn’t need generics - only a more versatile set of collections structures/facilities. 

On Sep 21, 2020, at 11:29 AM, roger peppe <rogp...@gmail.com> wrote:


--
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.

Robert Engels

unread,
Sep 21, 2020, 2:09:06 PM9/21/20
to burak serdar, roger peppe, Bakul Shah, Keith Randall, golang-dev
What are you referring to as “edge cases”?

> On Sep 21, 2020, at 1:01 PM, burak serdar <bse...@computer.org> wrote:
>
> On Mon, Sep 21, 2020 at 11:44 AM Robert Engels <ren...@ix.netcom.com> wrote:
>>
>> With respect to Java - almost all usages are the standard collections classes.
>>
>> See http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.296.6774&rep=rep1&type=pdf
>>
>> Which is why I said Go doesn’t need generics - only a more versatile set of collections structures/facilities.
>
> In my opinion, the reason Java generics are not much used for cases
> other than standard collections is that they are implemented with so
> many edge cases that using them for anything other than containers is
> almost impossible. I don't think Java is a good example to look at to
> get some insight on how generics are used.
>
>
>>
>> On Sep 21, 2020, at 11:29 AM, roger peppe <rogp...@gmail.com> wrote:
>>
>> 
>>> On Mon, 21 Sep 2020 at 17:12, Bakul Shah <ba...@iitbombay.org> wrote:
>>>
>>> On Sep 21, 2020, at 8:32 AM, roger peppe <rogp...@gmail.com> wrote:
>>>>
>>>> You mention recursive generic functions. This can become quite pathological if we're not careful about the rules.
>>>> For example, here's a short program that generates over 300k different generic functions: https://go2goplay.golang.org/p/GjBIOAYc_uu
>>>
>>> The reason I asked for stats on generics use in Java and other
>>> languages was to get an idea of how they are used. If the kind
>>> of example you wrote is common, then it would make sense to
>>> think of optimizing it. But I very much doubt it is.
>>> Don't let the perfect be the enemy of good!
>>
>>
>> I'm thinking of programs like the above as a kind of DoS attack. If it's easy to make a program with super-exponential compile time, that
>> could be a significant issue.
>>
>> --
>> You received this message because you are subscribed to the Google Groups "golang-dev" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.
>> To view this discussion on the web visit https://groups.google.com/d/msgid/golang-dev/CAJhgacj9Lyo1hJd6s1cnnxvXpbdQ-RafZSpBD%2B-ABgTU81Ymnw%40mail.gmail.com.
>>
>> --
>> You received this message because you are subscribed to the Google Groups "golang-dev" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.
>> To view this discussion on the web visit https://groups.google.com/d/msgid/golang-dev/AF337442-AFA9-4C8A-8FC1-BF73AFECA193%40ix.netcom.com.

burak serdar

unread,
Sep 21, 2020, 2:10:22 PM9/21/20
to Robert Engels, roger peppe, Bakul Shah, Keith Randall, golang-dev
On Mon, Sep 21, 2020 at 11:44 AM Robert Engels <ren...@ix.netcom.com> wrote:
>
> With respect to Java - almost all usages are the standard collections classes.
>
> See http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.296.6774&rep=rep1&type=pdf
>
> Which is why I said Go doesn’t need generics - only a more versatile set of collections structures/facilities.

In my opinion, the reason Java generics are not much used for cases
other than standard collections is that they are implemented with so
many edge cases that using them for anything other than containers is
almost impossible. I don't think Java is a good example to look at to
get some insight on how generics are used.


>
> On Sep 21, 2020, at 11:29 AM, roger peppe <rogp...@gmail.com> wrote:
>
> 
> On Mon, 21 Sep 2020 at 17:12, Bakul Shah <ba...@iitbombay.org> wrote:
>>
>> On Sep 21, 2020, at 8:32 AM, roger peppe <rogp...@gmail.com> wrote:
>> >
>> > You mention recursive generic functions. This can become quite pathological if we're not careful about the rules.
>> > For example, here's a short program that generates over 300k different generic functions: https://go2goplay.golang.org/p/GjBIOAYc_uu
>>
>> The reason I asked for stats on generics use in Java and other
>> languages was to get an idea of how they are used. If the kind
>> of example you wrote is common, then it would make sense to
>> think of optimizing it. But I very much doubt it is.
>> Don't let the perfect be the enemy of good!
>
>
> I'm thinking of programs like the above as a kind of DoS attack. If it's easy to make a program with super-exponential compile time, that
> could be a significant issue.
>
> --
> You received this message because you are subscribed to the Google Groups "golang-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/golang-dev/CAJhgacj9Lyo1hJd6s1cnnxvXpbdQ-RafZSpBD%2B-ABgTU81Ymnw%40mail.gmail.com.
>
> --
> You received this message because you are subscribed to the Google Groups "golang-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/golang-dev/AF337442-AFA9-4C8A-8FC1-BF73AFECA193%40ix.netcom.com.

burak serdar

unread,
Sep 21, 2020, 2:33:40 PM9/21/20
to Robert Engels, roger peppe, Bakul Shah, Keith Randall, golang-dev
On Mon, Sep 21, 2020 at 12:09 PM Robert Engels <ren...@ix.netcom.com> wrote:
>
> What are you referring to as “edge cases”?

Couple of things I can think of are not being able to use "new", and
problems with using arrays. The effects of type erasure, in other
words.
> To view this discussion on the web visit https://groups.google.com/d/msgid/golang-dev/236EDF19-850C-4AC9-94B0-D5245EBC8B6D%40ix.netcom.com.

Carla Pfaff

unread,
Sep 21, 2020, 2:55:32 PM9/21/20
to golang-dev
On Monday, 21 September 2020 at 17:33:07 UTC+2 rog wrote:
You mention recursive generic functions. This can become quite pathological if we're not careful about the rules.
For example, here's a short program that generates over 300k different generic functions: https://go2goplay.golang.org/p/GjBIOAYc_uu

A cautionary tale on monomorphization:

The recent Rust 1.46 release broke the compilation of Rust code in the Fuchsia project and many other projects, due to monomorphization blowup: https://github.com/rust-lang/rust/issues/54540#issuecomment-689163436

They had to increase the "type_length_limit", which limits the maximum number of type substitutions made when constructing a concrete type during monomorphization [1], to over 18 million.

Keith Randall

unread,
Sep 25, 2020, 11:59:03 AM9/25/20
to Carla Pfaff, golang-dev
> I'd hope for a hybrid: stenciling but sharing instantiations when they had the same GC bitmap shape.

> This is what I'd expect too. It'll still need a dictionary so that you can invoke generic operations (there would be a runtime cost similar to the runtime cost of invoking methods on an interface because method calls would be indirect), but at least you can copy values around and look up items on the stack without extra cost.

I've been thinking some more about this idea. Basically, we could have one implementation per GC shape, where GC shape includes the size, alignment, and pointer bitmap for each instantiated type.
The great thing about this idea is that a lot of the dictionary complexity goes away. Stack frames are constant sized, and all of the stack layout and pointer map sections of the dictionaries are no longer needed.
We'd still want other parts of the dictionaries, like the derived types and subdictionaries.

I'm not sure how much this would buy us, though. It would deduplicate the code for f[int64] and f[uint64], but not f[int64] and f[string]. The deduplication we get would probably be very similar to what we would get with full stenciling + linker deduplication of identical bodies. The only differing cases would be ones which did something "nonstandard", by which I mean creating a derived type or assigning a type parameter value to an interface{}. It's not clear to me that the additional complexity of runtime dictionaries is better than the additional complexity of linker deduplication.

> That said, to be clear, I think that the memory performance of a generic data structure has to be perfect.  I think it's important that people be able to easily understand the memory requirements of their generic data structures.

One place where this may not be true is escape analysis. Escape analysis without fully stenciling means that we have to make conservative assumptions about methods on generic types escaping their arguments. Not sure how often that would come up in practice, but it is a worry.

> Something that I haven't seen talked about much yet is how to deal with the requirement that the instantiations for a generic type must be seen before you can generate the code for that  type. This surely has implications for module compilation: would the body of every generic method and function need to be exported as source like potentially inlined functions are exported today, or would there be some kind of intermediate language involved? (SSA?)

Yes, we'll have to export full source code bodies for generic functions in object files (like we do for inlining).

> Are there any stats on how many different generic types/functions are used in Java / C++ programs? I am wondering about how much is "larger binaries" a real problem vs imagined.

This paper describes a mechanism to remove duplicate code in C++. They find about 6% duplication, which isn't huge. There may be some survivorship bias there though (maybe people already avoid large blowups when writing their code).