Extending the Go Map: Adding a derivative type, while maintaining the old semantics and syntax

716 views
Skip to first unread message

Kyle Stanly

unread,
Jun 14, 2016, 12:55:03 PM6/14/16
to golang-dev
The "CL", which I'll admit is not what everyone is probably expecting. Unfortunately, I cannot for the life of me figure out how to install Gerrit on a Cygwin machine (Using git-codereview on Cygwin gives an error when attempting to open a file using a Cygwin path, and using git-codereview with Command Prompt or Powershell fails to authenticate the SSH private key because it attempts to use a cygwin path.. go figure). The "CL" is an HTML file with it's coloration intact, so I hope this will do for now.

I will open up with a summary, in the case that some of you developers are too busy to read all of this.

Concurrent Map

I plan to add a new built-in type to Go (or at least propose it, once it gets finished) of which it's concurrency will be transparent. Meaning, the difference between these two statements...

m := make(map[int]int,0)

and

m := make(map[int]int,0,CONCURRENT)

will be the ONLY differences users will notice. This allows it to be backwards compatible, and hence maintain Go 1.0 compatibility. I'm not sure if I want to make it possible to cast between concurrent and non-concurrent maps, as while I've been reading the source code, I can definitely say it will not be an easy task. Hence for now I'm settling for a map that is compatible with normal map operations. 

What have I tried so far?

As can be seen in the "CL", I added a new Etype (TCMAP), Op (OTCMAP), map header (cmap), a reflective type (cmapType) and runtime type (cmaptype), as well as a compile-time type (CmapType). [And oh my god, why do the three distinct type have no further distinction besides which letters are capitalized? The names of the types are based on the original mapType/maptype/MapType respectively. Why not prefix it or something?] I've added quite a few helper methods, and make it "compatible" the normal map when it attempts to retrieve the Key and Value *Type's. I've also added specific cases where the original map, sometimes even adding my custom type to the case statement. So, I think I have reflective, compile, and run time types down.

What problem am I experiencing?

I think that before the typecheck of OMAKE->TMAP in typecheck.go, and evaluation of OMAKEMAP in walk.go, that the types of the map are already set, and in fact I'm almost certain because the issue is that it attempts to cast Type's (this is compile-time implementation of Type) Extra field, it is unable to cast it to my custom type, (*CMapType), as it is of type (*MapType). Now this happens before runtime, as it throws an error in the compile-time. Hence, I'm almost certain this is the case. 

I've tried explicitly changing the type of the node to TCMAP, which causes a panic on the attempted conversion. If I don't explicitly do so, it fails the "wantEtype" check.Hence, I need to know, where does it explicitly set the type for the node? Is it safe to override the Extra field in compile-time Type by replacing it with my own? I've tried doing so after I change Etype, but then I get issues where it pretty much has unresolved/non-substituted types. 

What do I want to know?

Where precisely does the Type for the node OMAKE and OMAKEMAP get set, and is it possible to change the Etype and Type of the node safely without messing everything up? Is the issue that I missed a step or two when I did this? Am I going the wrong way with this?

Potential alternative approaches?

Would it be possible to modify the normal hmap header and mapType/maptype/MapType to have a field which can be casted to different types to handle the different types of maps?

For example. if I added an empty interface, like such...

data interface{}

Where data can contain data of either the normal hmap or cmap, and then I won't have to worry about adding my own explicit type? This way it can also make type-conversions between both types easier as it uses the same header. The issue is this may break something else, and then I'd be back here again. It certainly would make things easier, all I'd have to do is add to hmap's flags the actual type. Would this be a better approach?

Matthew Dempsky

unread,
Jun 14, 2016, 1:18:39 PM6/14/16
to Kyle Stanly, golang-dev
On Tue, Jun 14, 2016 at 9:55 AM, Kyle Stanly <thei...@gmail.com> wrote:
Where precisely does the Type for the node OMAKE and OMAKEMAP get set,

In an expression like:

     make(map[int]string, 10)

we first recursively typecheck the argument "map[int]string":

        case OMAKE:
                [...]
                l := args[0]
                l = typecheck(l, Etype)
                t := l.Type

The "map[int]string" expression is represented as an OTMAP node, which becomes an OTYPE node representing a TMAP type after type checking.  OTYPE nodes have their Type field set to the Go type that they represent.  This is why the t.Etype switch will dispatch to case TMAP.

After the t.Etype switch (assuming we haven't returned early because of an error), n.Type is set to t:

                n.Type = t
                break OpSwitch

and is it possible to change the Etype and Type of the node safely without messing everything up?

It's not generally safe to mutate a Type value, because they might be referenced by other Types.  For example, if someone did:

    type M map[int]int
    type S []M
    var _ = make(M, 0, CONCURRENT)

you don't want to mutate M's Type directly, because it will also affect S.

It's okay in typecheck to assign a different *Type value to a Nod's Type field though.

Would it be possible to modify the normal hmap header and mapType/maptype/MapType to have a field which can be casted to different types to handle the different types of maps?

It probably is possible, but it might be somewhat involved.  E.g., you at least also need to update cmd/compile/internal/gc/reflect.go's representation of hmap.

Depending on how different your map needs to be, you could just use a flag bit in hmap's flags field to indicate concurrent vs non-concurrent.

Kyle Stanly

unread,
Jun 14, 2016, 1:19:20 PM6/14/16
to golang-dev
One quick question to add on to the above... was the map EVER planned on being overhauled or "extended"? It feels as if it 100% was not ever going to be touched again. Even if it was originally written in C, there could have been at least some abstraction designed, or at least over the (nearly 3) years, an interface for the map to implement to allow things like this in the future.

Ian Lance Taylor

unread,
Jun 14, 2016, 1:27:12 PM6/14/16
to Kyle Stanly, golang-dev
On Tue, Jun 14, 2016 at 10:19 AM, Kyle Stanly <thei...@gmail.com> wrote:
>
> One quick question to add on to the above... was the map EVER planned on
> being overhauled or "extended"? It feels as if it 100% was not ever going to
> be touched again. Even if it was originally written in C, there could have
> been at least some abstraction designed, or at least over the (nearly 3)
> years, an interface for the map to implement to allow things like this in
> the future.

The map implementation has already been completely rewritten once.

There have never been any plans to extend the map type, though. What
you are doing is an interesting exercise, but I think it's unlikely to
be added to the language. Much more likely to be used would a
concurrent map as a go-gettable package.

Ian

Kyle Stanly

unread,
Jun 14, 2016, 1:37:01 PM6/14/16
to golang-dev, thei...@gmail.com
It probably is possible, but it might be somewhat involved.  E.g., you at least also need to update cmd/compile/internal/gc/reflect.go's representation of hmap.

It's more involved than adding a new built-in type to Go?

 Depending on how different your map needs to be, you could just use a flag bit in hmap's flags field to indicate concurrent vs non-concurrent

The issue with this is that I plan on making it lock-free (or at least that's the end-goal), hence I'd need an entirely different structure. 

It's okay in typecheck to assign a different *Type value to a Nod's Type field though.

I'd need to know if there are any side-effects to doing so. Just doing...

t = typ(TCMAP)

Inside of OMAKE->TMAP gives an issue of pretty much untyped Type.

C:\Users\theif519\Documents\GitHub\go\src\go\parser\interface.go:12: cannot use make(map[<T>]<T>, int(4), 0) (type map[<T>]<T>) as type map[string]*ast.Object in field value
C:\Users\theif519\Documents\GitHub\go\src\go\parser\interface.go:209: internal compiler error: want MAP, but have map[<T>]<T>

goroutine 1 [running]:
runtime/debug.Stack(0x0, 0x0, 0x0)
        C:/Users/theif519/Documents/GitHub/go/src/runtime/debug/stack.go:24 +0x87
bootstrap/compile/internal/gc.Fatalf(0xa381e0, 0x14, 0xc082aee758, 0x2, 0x2)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/subr.go:165 +0x27d
bootstrap/compile/internal/gc.(*Type).wantEtype(0xc082711d60, 0xc08200a718)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/type.go:717 +0x13b
bootstrap/compile/internal/gc.(*Type).MapType(0xc082711d60, 0x6323bf)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/type.go:187 +0x2d
bootstrap/compile/internal/gc.hmap(0xc082711d60, 0x7)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/reflect.go:174 +0x2c
bootstrap/compile/internal/gc.walkexpr(0xc082a72870, 0xc082aef5b0, 0xc082c49300)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/walk.go:1485 +0xa27a
bootstrap/compile/internal/gc.walkexpr(0xc082c493b0, 0xc082aef5b0, 0x0)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/walk.go:735 +0x1e42
bootstrap/compile/internal/gc.walkstmt(0xc082c493b0, 0xc082a72360)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/walk.go:192 +0x2bb
bootstrap/compile/internal/gc.walkstmtlist(0xc082c25200, 0x25, 0x40)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/walk.go:80 +0x5d
bootstrap/compile/internal/gc.walk(0xc08280f440)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/walk.go:65 +0x377
bootstrap/compile/internal/gc.compile(0xc08280f440)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/pgen.go:391 +0x85b
bootstrap/compile/internal/gc.funccompile(0xc08280f440)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/dcl.go:1287 +0x1c1
bootstrap/compile/internal/gc.Main()
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/main.go:467 +0x20e3
bootstrap/compile/internal/amd64.Main()
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/amd64/galign.go:93 +0x510
main.main()
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/main.go:33 +0x39c

go tool dist: FAILED: C:\Users\theif519\Documents\GitHub\go\pkg\tool\windows_amd64\compile -pack -o C:\Users\theif519\AppData\Local\Temp\go-tool-dist-245681255\go\parser\_go_.a -p go/parser C:\Users\theif519\Documents\GitHub\go\src\go\parser\interface.go C:\Users\theif519\Documents\GitHub\go\src\go\parser\parser.go: exit status 2

I'm assuming I'd have to have it evaluate the parameter types somehow? Could I use substAny on the new type to evaluate it? Like...

t = typ(CMAP)
t = substAny(t, n.List.First().Key(), n.List.First().Value())

Since OMAKE uses the List to determine the parameters passed to it. Pretty much, syntax aside, I use substAny to substitute the key and value types for the new type?
 

Matthew Dempsky

unread,
Jun 14, 2016, 1:37:51 PM6/14/16
to Kyle Stanly, golang-dev
On Tue, Jun 14, 2016 at 10:19 AM, Kyle Stanly <thei...@gmail.com> wrote:
Even if it was originally written in C, there could have been at least some abstraction designed, or at least over the (nearly 3) years, an interface for the map to implement to allow things like this in the future.

The Go programming language itself provides the abstraction layer for maps.  Like Ian points out, the map implementation has been rewritten once, and there have been smaller incremental refactorings along the way too.  These have required compiler+runtime changes, but no changes to user code.

While additional abstractions may have benefited your project, we might also have mispredicted what sort of abstractions would be necessary or appropriate.  And in the mean time, all Go programs that use maps would pay a performance penalty for the additional abstraction layers.

Kyle Stanly

unread,
Jun 14, 2016, 1:44:46 PM6/14/16
to golang-dev, thei...@gmail.com
There have never been any plans to extend the map type, though.  What 
you are doing is an interesting exercise, but I think it's unlikely to 
be added to the language.  Much more likely to be used would a 
concurrent map as a go-gettable package. 

Even if it's just a small chance, I'll take it. I would make a go-gettable, but I need access to low-level atomic primitives like memory barrier (which I see is implemented as mbarrier.go in the runtime package, unexported of course), hashing (which I don't know if it is possible to hash an object outside of Runtime, without having them implement a Hash() interface, which severely limits the usage of the map) without having to rely on the runtime hashmap.go I'm trying to replace, etc.

Matthew Dempsky

unread,
Jun 14, 2016, 1:46:10 PM6/14/16
to Kyle Stanly, golang-dev
On Tue, Jun 14, 2016 at 10:37 AM, Kyle Stanly <thei...@gmail.com> wrote:
It probably is possible, but it might be somewhat involved.  E.g., you at least also need to update cmd/compile/internal/gc/reflect.go's representation of hmap.

It's more involved than adding a new built-in type to Go?

Sorry, I'm not sure how to interpret that question.

The compiler has knowledge of certain types in package runtime, including hmap.  If you look the "hmap" function in cmd/compile/internal/gc/reflect.go, you'll see how the compiler manually constructs the same type representation.  So if you want to change the hmap struct's fields, you'll need to change both package runtime and the compiler.


It's okay in typecheck to assign a different *Type value to a Nod's Type field though.

I'd need to know if there are any side-effects to doing so. Just doing...

t = typ(TCMAP)

Inside of OMAKE->TMAP gives an issue of pretty much untyped Type.

There are other fields you need to populate to construct a new Type value.  See the typMap function in type.go.

If you're going to introduce a new TCMAP type kind, you'll probably also need to add code to handle TCMAPs everywhere that TMAPs are currently handled.

It might be easier for you to just add an extra field to MapType.

Could I use substAny on the new type to evaluate it?

No, substAny is very specialized.  It's purpose is just for rewriting type signatures from builtin/runtime.go, which use the magic "any" type.

Kyle Stanly

unread,
Jun 14, 2016, 2:02:18 PM6/14/16
to golang-dev, thei...@gmail.com
And in the mean time, all Go programs that use maps would pay a performance penalty for the additional abstraction layers.

I agree that adding abstraction will reduce performance quite a bit, especially for those who don't use any other maps besides the default, and I partly understand, but it doesn't seem to be in a maintainable state. When you said it was rewritten, was that when it was converted from C to Go, or some other time? 

Another reason I want to introduce a separate type from the original map (I.E, not just add a new field to hashmap.go unless I have to) is because of this extra abstraction penalty as well. I don't know how much overhead constantly checking fields through an interface would be, but it can't be as good as without, hence by adding my own map that's compatible with the old map's syntax (I.E "m[1] = 1"), then the only overhead is at compile-time. 

While additional abstractions may have benefited your project, we might also have mispredicted what sort of abstractions would be necessary or appropriate

That's understandable as well, however one would have been to allow users to implement their own maps outside of the runtime. At the bare-minimum allow a way to hash any given interface{} outside of the runtime. 

Matthew Dempsky

unread,
Jun 14, 2016, 2:10:51 PM6/14/16
to Kyle Stanly, golang-dev
On Tue, Jun 14, 2016 at 11:02 AM, Kyle Stanly <thei...@gmail.com> wrote:
When you said it was rewritten, was that when it was converted from C to Go, or some other time? 

khr rewrote the hashmap implementation back while it was still in C: https://golang.org/cl/7504044

The port to Go happened later.

Ian Lance Taylor

unread,
Jun 14, 2016, 2:12:40 PM6/14/16
to Kyle Stanly, golang-dev
On Tue, Jun 14, 2016 at 11:02 AM, Kyle Stanly <thei...@gmail.com> wrote:
>
> I agree that adding abstraction will reduce performance quite a bit,
> especially for those who don't use any other maps besides the default, and I
> partly understand, but it doesn't seem to be in a maintainable state. When
> you said it was rewritten, was that when it was converted from C to Go, or
> some other time?

A different time.

I'm sorry you don't consider it maintainable, but it is in fact
maintained. It doesn't seem that bad to me, considering that it is
doing a quite complex job.


> That's understandable as well, however one would have been to allow users to
> implement their own maps outside of the runtime. At the bare-minimum allow a
> way to hash any given interface{} outside of the runtime.

My recollection is that that has been proposed before, but I couldn't
find the discussion. One complex aspect is that we don't want to lock
ourselves into a specific hash function, but a hash function that
changes with each release is much harder for other people to use.

Ian

Kyle Stanly

unread,
Jun 14, 2016, 2:24:09 PM6/14/16
to golang-dev, thei...@gmail.com
There are other fields you need to populate to construct a new Type value.  See the typMap function in type.go.

Oh yeah, I already implemented my own typCMap function, I called it from OTCMAP, which apparently never gets called. So, calling that basically sets the Key and Value, and since the Etype is TMAP, I'm assuming the Key and Val can be extracted directly from it.

At least the call-stack is better than before.

C:\Users\theif519\Documents\GitHub\go\src\go\parser\interface.go:12: cannot use make(map[string]*ast.Object, int(4), 0) (type map[string]*ast.Object) as type map[string]*ast.Object in field value
C:\Users\theif519\Documents\GitHub\go\src\go\parser\interface.go:209: internal compiler error: want MAP, but have map[string]*ast.Object

goroutine 1 [running]:
runtime/debug.Stack(0x0, 0x0, 0x0)
        C:/Users/theif519/Documents/GitHub/go/src/runtime/debug/stack.go:24 +0x87
bootstrap/compile/internal/gc.Fatalf(0xa38120, 0x14, 0xc082b26758, 0x2, 0x2)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/subr.go:165 +0x27d
bootstrap/compile/internal/gc.(*Type).wantEtype(0xc082825a90, 0xc08200a818)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/type.go:717 +0x13b
bootstrap/compile/internal/gc.(*Type).MapType(0xc082825a90, 0x6323bf)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/type.go:187 +0x2d
bootstrap/compile/internal/gc.hmap(0xc082825a90, 0x7)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/reflect.go:174 +0x2c
bootstrap/compile/internal/gc.walkexpr(0xc08260c240, 0xc082b275b0, 0xc082c4e300)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/walk.go:1485 +0xa27a
bootstrap/compile/internal/gc.walkexpr(0xc082c4e3f0, 0xc082b275b0, 0x0)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/walk.go:735 +0x1e42
bootstrap/compile/internal/gc.walkstmt(0xc082c4e3f0, 0xc082611320)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/walk.go:192 +0x2bb
bootstrap/compile/internal/gc.walkstmtlist(0xc082c27200, 0x25, 0x40)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/walk.go:80 +0x5d
bootstrap/compile/internal/gc.walk(0xc082801290)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/walk.go:65 +0x377
bootstrap/compile/internal/gc.compile(0xc082801290)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/pgen.go:391 +0x85b
bootstrap/compile/internal/gc.funccompile(0xc082801290)
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/dcl.go:1287 +0x1c1
bootstrap/compile/internal/gc.Main()
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/gc/main.go:467 +0x20e3
bootstrap/compile/internal/amd64.Main()
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/internal/amd64/galign.go:93 +0x510
main.main()
        C:/Users/theif519/Documents/GitHub/go/src/cmd/compile/main.go:33 +0x39c

go tool dist: FAILED: C:\Users\theif519\Documents\GitHub\go\pkg\tool\windows_amd64\compile -pack -o C:\Users\theif519\AppData\Local\Temp\go-tool-dist-719667311\go\parser\_go_.a -p go/parser C:\Users\theif519\Documents\GitHub\go\src\go\parser\interface.go C:\Users\theif519\Documents\GitHub\go\src\go\parser\parser.go: exit status 2
go/doc


Now the actual type is evaluated, but I'm not sure what it means by it wants MAP, even though it is given a map. I do handle the case for TCMAP is typ, implemented typCMap, and re-assigning the type should change it's Etype too. Guess I might have missed something. Any suggestions? I have like 20+ files open (reflect/*, cmd/compile/internal/gc/*, runtime/*) and it's getting. This is appearing while I am attempting to run all.bat, hence it shouldn't use anything but the original default hashmap, and I only conditionally change the type if there is a third argument, which there shouldn't be. Hence it should compile fine since the changes I made shouldn't effect it.

No, substAny is very specialized.  It's purpose is just for rewriting type signatures from builtin/runtime.go, which use the magic "any" type.

When I finish this and make my proposal, I'll be sure to add that as a documentation contribution. 

It might be easier for you to just add an extra field to MapType.
 
I want a long-term solution rather than a short-term one. Adding a new field would mean adding a lot of runtime abstraction overhead that was avoided when it was designed. Plus, this isn't the only type I plan on adding, hence if I manage to do this, I should be set for the next one. 

Kyle Stanly

unread,
Jun 14, 2016, 2:32:23 PM6/14/16
to golang-dev, thei...@gmail.com
One complex aspect is that we don't want to lock ourselves into a specific hash function, but a hash function that changes with each release is much harder for other people to use. 

It could have been added as an unsafe package. I.E, unsafe.Hash, with a warning that it could potentially change in the future, as it states in the package documentation, "Packages that import unsafe may be non-portable and are not protected by the Go 1 compatibility guidelines." I'd have rather there been a potentially-changing option than no option at all. 

I'm sorry you don't consider it maintainable, but it is in fact 
maintained.  It doesn't seem that bad to me, considering that it is 
doing a quite complex job. 

I don't mean any disrespect, and I won't dispute it's high performance and complexity, its just that it's lacking in documentation and readability.

Kyle Stanly

unread,
Jun 14, 2016, 3:00:14 PM6/14/16
to golang-dev, thei...@gmail.com
I managed to pinpoint it to 'go/ast', the culprit code being...

// NewScope creates a new scope nested in the outer scope.
func NewScope(outer *Scope) *Scope {
const n = 4 // initial scope capacity
return &Scope{outer, make(map[string]*Object, n)}
}

Is there something extra being doing when you have a OPTRLIT Nod? Why is it the only one that evaluates to a third parameter to make? I suppose one way to get around this is to check if the value is greater than 0, but that doesn't explain why it evaluates to three parameters. It correctly compiles a lot of other files. Doing a find across all files for any potential OPTRLIT code, NewScope is the only one which provides a second argument except for the tests (which I assume aren't compiled (???)). Then again, if it did this normally it would throw an error for an extra unused argument, I don't know.

Kyle Stanly

unread,
Jun 14, 2016, 3:16:31 PM6/14/16
to golang-dev
After checking my gmail, it seems Keith Randall stated "Let me add some emphasis to what Ian said.  This change is not going to be added to the language.  There just isn't sufficient demand and the language is frozen.  Your only route is to make a go-gettable package and use it to demonstrate compelling utility and demand.", but I do not see it appear on here. Would this still hold true? I know the chances might be rather slim, but if given a benchmark which proves it to be superior when used concurrently to the original map, would that be enough? I don't believe any go-gettable would be faster than the runtime map, since it needs to use the runtime map anyway to hash it's values, making it rather pointless, and impossible to make lock-free. 

Once again, it won't be a straight commit/dump, I plan to actually provide accurate and extensive benchmarks, profiling, graphs of performance and overhead, etc., proper documentation of it's usage, etc..

Ian Lance Taylor

unread,
Jun 14, 2016, 3:55:25 PM6/14/16
to Kyle Stanly, golang-dev
On Tue, Jun 14, 2016 at 12:16 PM, Kyle Stanly <thei...@gmail.com> wrote:
> After checking my gmail, it seems Keith Randall stated "Let me add some
> emphasis to what Ian said. This change is not going to be added to the
> language. There just isn't sufficient demand and the language is frozen.
> Your only route is to make a go-gettable package and use it to demonstrate
> compelling utility and demand.", but I do not see it appear on here. Would
> this still hold true? I know the chances might be rather slim, but if given
> a benchmark which proves it to be superior when used concurrently to the
> original map, would that be enough? I don't believe any go-gettable would be
> faster than the runtime map, since it needs to use the runtime map anyway to
> hash it's values, making it rather pointless, and impossible to make
> lock-free.
>
> Once again, it won't be a straight commit/dump, I plan to actually provide
> accurate and extensive benchmarks, profiling, graphs of performance and
> overhead, etc., proper documentation of it's usage, etc..

A go-gettable package will show whether there is demand. Without
demand, in my opinion, it won't be added.

Ian

Matthew Dempsky

unread,
Jun 14, 2016, 3:56:51 PM6/14/16
to Kyle Stanly, golang-dev
On Tue, Jun 14, 2016 at 12:16 PM, Kyle Stanly <thei...@gmail.com> wrote:
Would this still hold true? I know the chances might be rather slim, but if given a benchmark which proves it to be superior when used concurrently to the original map, would that be enough?

As I see it, there are two projects you can work on:

1. You can experiment with making changes to the Go programming language.  Some of the details here are subtle/tricky, so I and others have been answering questions about how to implement these changes.  You should not mistake this as support/approval though.  The sorts of changes you've been experimenting with so far are very unlikely to be accepted, as Ian and Keith have already emphasized.

2. You can experiment with alternative hashmap implementations to see if any are faster than or have other benefits over the current one.  If you find one, that would be very interesting and has a chance of being accepted.  However, in this case, it's unnecessary for you to work out all the integration details about how to make it work with the compiler/runtime.  It would be sufficient for you to demonstrate that your own map[string]int or map[int32]int alternative type supports the same abstract operations but more efficiently.

I don't believe any go-gettable would be faster than the runtime map, since it needs to use the runtime map anyway to hash it's values, making it rather pointless, and impossible to make lock-free.

I certainly don't recommend this for production code, but for experimentation/measurement, you can use cmd/compile's //go:linkname directive to access runtime's hash functions:

    package mymap

    import _ "unsafe" // for go:linkname

    //go:linkname strhash runtime.strhash
    func strhash(s *string, seed uintptr) uintptr

    //go:linkname int32hash runtime.memhash32
    func int32hash(i *int32, seed uintptr) uintptr

Matthew Dempsky

unread,
Jun 14, 2016, 4:40:16 PM6/14/16
to Kyle Stanly, golang-dev
On Tue, Jun 14, 2016 at 12:56 PM, Matthew Dempsky <mdem...@google.com> wrote:
1. You can experiment with making changes to the Go programming language.  Some of the details here are subtle/tricky, so I and others have been answering questions about how to implement these changes.  You should not mistake this as support/approval though.  The sorts of changes you've been experimenting with so far are very unlikely to be accepted, as Ian and Keith have already emphasized.

To be clear, I also don't mean to discourage you from experimenting out of personal interest.  Go is an open source project, and you're free to do with your own time as you see fit.  I've certainly done my share of one-off experiments that ultimately weren't accepted, but were still personally satisfying to implement.

I just want to make sure your expectations are appropriately calibrated. :)

Kyle Stanly

unread,
Jun 14, 2016, 4:42:16 PM6/14/16
to golang-dev, thei...@gmail.com
I certainly don't recommend this for production code, but for experimentation/measurement, you can use cmd/compile's //go:linkname directive to access runtime's hash functions:

Okay, 100%, if I knew I could do this from the beginning I definitely would have gone about this approach instead. I suppose I could use that approach for any runtime function as well, correct?  Even even for mbarrier.go? This definitely is easier than hacking together a custom type for the compiler. Adding another field to the map is also viable as well, but if I can access all of runtime's function outside of the runtime, there'd be no need (besides for the lack of the map syntax and semantics)


1. You can experiment with making changes to the Go programming language.  Some of the details here are subtle/tricky, so I and others have been answering questions about how to implement these changes.  You should not mistake this as support/approval though.  The sorts of changes you've been experimenting with so far are very unlikely to be accepted, as Ian and Keith have already emphasized.

2. You can experiment with alternative hashmap implementations to see if any are faster than or have other benefits over the current one.  If you find one, that would be very interesting and has a chance of being accepted.  However, in this case, it's unnecessary for you to work out all the integration details about how to make it work with the compiler/runtime.  It would be sufficient for you to demonstrate that your own map[string]int or map[int32]int alternative type supports the same abstract operations but more efficiently.

Both of these seem like good projects. Keep the changes minimal and with minimal impact and it should be fine right?

Kyle Stanly

unread,
Jun 14, 2016, 4:45:07 PM6/14/16
to golang-dev, thei...@gmail.com
It just makes me rethink some of the things I'll be doing and the end-goal for this project. All-in-all, I'll be continuing experimenting with the map, but leaving the compiler along for the time being.

Matthew Dempsky

unread,
Jun 14, 2016, 5:11:54 PM6/14/16
to Kyle Stanly, golang-dev
On Tue, Jun 14, 2016 at 1:42 PM, Kyle Stanly <thei...@gmail.com> wrote: 
I suppose I could use that approach for any runtime function as well, correct?  Even even for mbarrier.go?

Correct.

Note that to appease "go build", you'll also need to add a dummy empty assembly file (e.g., "empty.s"), otherwise the compiler will give you errors about declaring a function without a body.

Kyle Stanly

unread,
Jun 15, 2016, 9:38:15 AM6/15/16
to golang-dev, thei...@gmail.com
I truly appreciate the help you've given me here. Definitely made my life a whole lot easier, however I have one final question (for now)...

Is there a specific hash function used to hash literally anything reliably? I see 'memhash' in 'hash64.go' and 'hash32.go', which take a unsafe.Pointer and seems to hash it byte by byte without regard for padding in structs and the like, hence I'm assuming its not safe to use for general-purposes. One of the very main reasons I wanted to add my map to the runtime was not only to have access to the runtime functions (Which apparently can be solved with '//go:linkname') but also because 'maptype' contains valuable information that can be used for efficiency reasons, such as the hashing algorithm used. 

I'm being optimistic here, but it seems that the hashing algorithm is stored in it's actual type, and obtaining 'TypeOf' SHOULD theoretically allow me to retrieve it's 'Type' (Reflection implementation of Type), which is also castable to '*rtype', but then again not really because it's unexported. I'm assuming that once accessible, I will have access to it's 'alg' field, which will contain the hashing algorithm used to hash it. 

The issue, of course, being that it's unexported. I'm assuming //go:linkname only works for functions? I suppose it'd be possible create my own function which just takes a 'Type', which obtains it's '*rtype' and return it's 'hash' function it's 'alg' field?

I.E, adding an unexported function like such:

func Hash(obj interface{}, seed uintptr) uintptr {
   t
:= TypeOf(obj).(*rtype)
   
return t.alg.hash(unsafe.Pointer(&obj), seed)
}

But this makes a few assumptions:

1) Do all types implement alg's hash and equals functions?

2) Does a function that does this exist already inside of the runtime?

3) When is the hashing algorithm decided? Compile-time or runtime?

4) Is this safe to do? What types are not hashable, and why are they not hashable?

5) Is there a way to directly access an unexported struct via //go:linkname, this way I can have it compatible and work easily with existing versions of Go. I.E, I would normally be unable to access rtype and typeAlg.

Kyle Stanly

unread,
Jun 15, 2016, 9:49:47 AM6/15/16
to golang-dev, thei...@gmail.com
Actually, now that I write this, I realize how Go got around this issue... it defined a type with the same types and names of fields (rtype and _type) to allow it to be cast. Hence, would it be possible to create my own mirrors of rtype and cast it myself? 

What I mean is, do something like...

// Castable mirror of rtype
type typeMirror struct {
size       uintptr
ptrdata    uintptr
hash       uint32
tflag      tflag
align      uint8
fieldalign uint8
kind       uint8
alg        *typeAlg
gcdata *byte
str    nameOff
_      int32
}

// Mirror for runtime and reflective typeAlg
type typeAlg struct {
hash func(unsafe.Pointer, uintptr) uintptr
equal func(unsafe.Pointer, unsafe.Pointer) bool
}

And obtain the hash function this way...

TypeOf(someObj).(*typeMirror).alg.hash(someObj, seed)

Keith Randall

unread,
Jun 15, 2016, 10:13:39 AM6/15/16
to Kyle Stanly, golang-dev
On Wed, Jun 15, 2016 at 6:49 AM, Kyle Stanly <thei...@gmail.com> wrote:
Actually, now that I write this, I realize how Go got around this issue... it defined a type with the same types and names of fields (rtype and _type) to allow it to be cast. Hence, would it be possible to create my own mirrors of rtype and cast it myself? 

What I mean is, do something like...

// Castable mirror of rtype
type typeMirror struct {
size       uintptr
ptrdata    uintptr
hash       uint32
tflag      tflag
align      uint8
fieldalign uint8
kind       uint8
alg        *typeAlg
gcdata *byte
str    nameOff
_      int32
}

// Mirror for runtime and reflective typeAlg
type typeAlg struct {
hash func(unsafe.Pointer, uintptr) uintptr
equal func(unsafe.Pointer, unsafe.Pointer) bool
}

And obtain the hash function this way...

TypeOf(someObj).(*typeMirror).alg.hash(someObj, seed)

Mostly right, but that cast will fail.  You'll have to go through unsafe.
    var i interface{}
    p := *(**typeMirror)(unsafe.Pointer(&i))
should get you a typeMirror pointer from an interface{}.


On Wednesday, June 15, 2016 at 9:38:15 AM UTC-4, Kyle Stanly wrote:
I truly appreciate the help you've given me here. Definitely made my life a whole lot easier, however I have one final question (for now)...

Is there a specific hash function used to hash literally anything reliably? I see 'memhash' in 'hash64.go' and 'hash32.go', which take a unsafe.Pointer and seems to hash it byte by byte without regard for padding in structs and the like, hence I'm assuming its not safe to use for general-purposes. One of the very main reasons I wanted to add my map to the runtime was not only to have access to the runtime functions (Which apparently can be solved with '//go:linkname') but also because 'maptype' contains valuable information that can be used for efficiency reasons, such as the hashing algorithm used. 

I'm being optimistic here, but it seems that the hashing algorithm is stored in it's actual type, and obtaining 'TypeOf' SHOULD theoretically allow me to retrieve it's 'Type' (Reflection implementation of Type), which is also castable to '*rtype', but then again not really because it's unexported. I'm assuming that once accessible, I will have access to it's 'alg' field, which will contain the hashing algorithm used to hash it. 

The issue, of course, being that it's unexported. I'm assuming //go:linkname only works for functions? I suppose it'd be possible create my own function which just takes a 'Type', which obtains it's '*rtype' and return it's 'hash' function it's 'alg' field?

I.E, adding an unexported function like such:

func Hash(obj interface{}, seed uintptr) uintptr {
   t
:= TypeOf(obj).(*rtype)
   
return t.alg.hash(unsafe.Pointer(&obj), seed)
}

But this makes a few assumptions:

1) Do all types implement alg's hash and equals functions?

No.  For instance, []int is not hashable.  Alg and equals will only be available for types which are allowed as keys in maps.
 
2) Does a function that does this exist already inside of the runtime?
Sort of, see interhash and nilinterhash in alg.go.  You pass them a pointer to an interface{ ... } or an interface{} (respectively).  The easiest route would be to export one of these using linkname.
 

3) When is the hashing algorithm decided? Compile-time or runtime?
Compile time.
 

4) Is this safe to do? What types are not hashable, and why are they not hashable?
The language spec defines this:
The comparison operators == and != must be fully defined for operands of the key type; thus the key type must not be a function, map, or slice. If the key type is an interface type, these comparison operators must be defined for the dynamic key values; failure will cause a run-time panic.
 

5) Is there a way to directly access an unexported struct via //go:linkname, this way I can have it compatible and work easily with existing versions of Go. I.E, I would normally be unable to access rtype and typeAlg.
No, only functions work with linkname.
 

On Tuesday, June 14, 2016 at 5:11:54 PM UTC-4, Matthew Dempsky wrote:
On Tue, Jun 14, 2016 at 1:42 PM, Kyle Stanly <thei...@gmail.com> wrote: 
I suppose I could use that approach for any runtime function as well, correct?  Even even for mbarrier.go?

Correct.

Note that to appease "go build", you'll also need to add a dummy empty assembly file (e.g., "empty.s"), otherwise the compiler will give you errors about declaring a function without a body.

--
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Matthew Dempsky

unread,
Jun 15, 2016, 3:56:18 PM6/15/16
to Kyle Stanly, golang-dev
On Wed, Jun 15, 2016 at 6:38 AM, Kyle Stanly <thei...@gmail.com> wrote:
Is there a specific hash function used to hash literally anything reliably?

If you want a generic hash function, I'd suggest using nilinterhash.  That way all you need to do is convert your value to interface{}, and the compiler/runtime take care of all the runtime type handling logic.

5) Is there a way to directly access an unexported struct via //go:linkname, this way I can have it compatible and work easily with existing versions of Go. I.E, I would normally be unable to access rtype and typeAlg.

Only functions and variables can be used with //go:linkname.

However, it's entirely up to you to make sure the parameter types match, and there are already packages that duplicate types from package runtime so that they can share complex data structures.  For example, reflect.rtype and time.runtimeTimer mirror runtime._type and runtime.timer, respectively.

Kyle Stanly

unread,
Jun 16, 2016, 8:13:48 AM6/16/16
to golang-dev, thei...@gmail.com
Thank you, Keith, Matthew, and Ian, for all of the amazing and helpful responses and answering my questions. 
Reply all
Reply to author
Forward
0 new messages