Obtaining an efficient hash for Go values

203 views
Skip to first unread message

Arnaud Delobelle

unread,
Dec 24, 2020, 5:18:47 AM12/24/20
to golang-nuts
Hi there!

In my continued efforts to improve the performance of my Go Lua implementation [1], I have reached a bottleneck which causes me a quandary.

Lua has a data structure which is called 'table', which is essentially a hashmap.  So far I have implemented it as a Go map, which works OK.  However there is significant overhead coming from the fact that Lua has a `next` function that allows getting the "next" key-value pair in a table after a given one: `next(t, key)`.  As far as I can tell Go doesn't allow this so if I want to use a Go map, I also have to keep track of the next key for each key, which doubles the memory requirement, necessitates more accounting in the code and makes iteration via `next` slower.

So I am looking at not using the builtin Go map and making my own hashtable implementation.  However, because the keys are still made of Go values, I would like to benefit from the quick hashing that maps use.  After some poking around in the implementation of map (and discovering the //go:linkname compiler directive), I think that I can do this:


// This is the Lua Value type.  The scalar part contains the payload of int64, float64 or bool for quicker access and minimising allocations.
type Value struct {
    scalar uint64
    iface interface{}
}


//go:linkname goRuntimeInt64Hash runtime.int64Hash
//go:noescape
func goRuntimeInt64Hash(i uint64, seed uintptr) uintptr

//go:linkname goRuntimeEfaceHash runtime.efaceHash
//go:noescape
func goRuntimeEfaceHash(i interface{}, seed uintptr) uintptr

// Hash returns a hash for the value.
func (v Value) Hash() uintptr {
    if v.scalar != 0 {
            return goRuntimeInt64Hash(v.scalar, 0)
    }
    return goRuntimeEfaceHash(v.iface, 0)
}

Does that sound like a sensible approach?  I.e. is it safe enough to use the go:linkname directive, and do those seem like the right functions to call to obtain a good hash?

TIA

-- 
Arnaud

Dan Kortschak

unread,
Dec 24, 2020, 5:31:54 AM12/24/20
to golan...@googlegroups.com
You can also use the internal map implementation and make us of the
runtime's map iterator. This is relatively straightforward at the cost
of vigilance for changes in the runtime.

Here is an example (note that yo need a .S file as well to get the
go:linkname magic to work).
https://github.com/gonum/gonum/blob/master/graph/iterator/map.go

(we back this with a reflect equivalent that it provided by
https://golang.org/pkg/reflect/#MapIter.

Dan
> --
> You received this message because you are subscribed to the Google
> Groups "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it,
> send an email to golang-nuts...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/golang-nuts/79e3f124-037c-4c10-a7b6-42f496bd26b6n%40googlegroups.com
> .



Axel Wagner

unread,
Dec 24, 2020, 5:45:30 AM12/24/20
to Arnaud Delobelle, golang-nuts
On Thu, Dec 24, 2020 at 11:19 AM Arnaud Delobelle <arn...@gmail.com> wrote:
Does that sound like a sensible approach?  I.e. is it safe enough to use the go:linkname directive, and do those seem like the right functions to call to obtain a good hash?

It is probably better to use hash/maphash, if you can.
It's the intended use-case for that package and not subject to changes in the runtime. However, the downside is that it doesn't allow all the same values to be used as keys as Go maps do (you have to do the mapping from values to []byte manually), so it might not be suitable.
 
Personally, I feel that go:linkname and similar trickery is too much black magic to actually use. But TBF, I might be too biased on cleanliness over performance and other concerns.

Arnaud Delobelle

unread,
Dec 24, 2020, 9:12:00 AM12/24/20
to golang-nuts

(I am replying inline to make the context of my comment clear, although it seems common practice to top-post in this forum - is there any guidance about this?)

Arnaud

On Thursday, 24 December 2020 at 10:31:54 UTC kortschak wrote:
You can also use the internal map implementation and make us of the
runtime's map iterator. This is relatively straightforward at the cost
of vigilance for changes in the runtime.

Here is an example (note that yo need a .S file as well to get the
go:linkname magic to work).
https://github.com/gonum/gonum/blob/master/graph/iterator/map.go


That's interesting, thanks.  Although I don't think it fits my use case because I need something stronger than an iterator: a function that given a table and a key returns the next key in the table (as this is a public API that Lua provides, see https://www.lua.org/manual/5.3/manual.html#pdf-next).  From my point of view this is an unfortunate API!  Nevertheless it is what it is...  I haven't found even a hacky way to obtain that for a Go map.  That is why I think I need to make my own hashtable implementation

Arnaud Delobelle

unread,
Dec 24, 2020, 9:19:53 AM12/24/20
to golang-nuts
I did have a look at hash/maphash, but I didn't think it was the right shaped API for what I am doing.  In short, I would have to contrive to provide words to the package as byte slices, which then would turn them back into words under the hood.  That feels like a superfluous round trip, given that the table is at the heart of everything in Lua as it is the only data structure available (so it is used to implement objects, modules, namespaces...).  I haven't discarded it yet but I wanted to explore other possibilities before doing some benchmarking.

Arnaud

Jan Mercl

unread,
Dec 24, 2020, 9:31:22 AM12/24/20
to Arnaud Delobelle, golang-nuts
On Thu, Dec 24, 2020 at 3:12 PM Arnaud Delobelle <arn...@gmail.com> wrote:
> That's interesting, thanks. Although I don't think it fits my use case because I need something stronger than an iterator: a function that given a table and a key returns the next key in the table (as this is a public API that Lua provides, see https://www.lua.org/manual/5.3/manual.html#pdf-next). From my point of view this is an unfortunate API! Nevertheless it is what it is... I haven't found even a hacky way to obtain that for a Go map. That is why I think I need to make my own hashtable implementation

The linked pdf seems to indicate that Lua's 'next' can be implemented
using Go range over a map with no problem because the iteration order
is undefined in Lua as well.

What am I missing?

Arnaud Delobelle

unread,
Dec 24, 2020, 10:07:46 AM12/24/20
to golang-nuts
The order is undefined but stable (provided you don't insert new values), and accessible via the `next` function.  Here is an example:

$ lua
Lua 5.3.5  Copyright (C) 1994-2018 Lua.org, PUC-Rio
> t = { x = 1, y = 2, z = 3 }
> next(t, 'y')
z       3
> next(t, 'z')
x       1
> next(t, 'y')
z       3
> next(t, 'z')
x       1

I don't understand how to do that with a map iterator unless get a new iterator each time and consume it till I get the key I want to find the next key for (which would be very inefficient), or there is a way to initialise the iterator at a given key - but I haven't found how to do the latter.

Arnaud

 

Axel Wagner

unread,
Dec 24, 2020, 10:31:20 AM12/24/20
to Arnaud Delobelle, golang-nuts
I think theoretically, you might be able to optimize it away (at least in the overwhelming majority of use-cases) by storing a couple of `*reflect.MapIter` for the last used tables in the context of the VM and then check, when `next` is called, if the key passed is the one it's currently at. But I agree, that it's pretty hackish and also doesn't give you the "stable, unless the elements are changed" property.

So, I tend to agree that you won't be able to do without your own hashtable. Which is a shame, as the builtin map is pretty good.

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.

Wojciech S. Czarnecki

unread,
Dec 24, 2020, 10:51:49 AM12/24/20
to golan...@googlegroups.com
Dnia 2020-12-24, o godz. 06:19:53
Arnaud Delobelle <arn...@gmail.com> napisał(a):

> I did have a look at hash/maphash, but I didn't think it was the right
> shaped API for what I am doing.

Do not be afraid to just "steal" runtime implementation and tinker with it to
get at desired properties.

https://github.com/golang/go/blob/master/src/runtime/map.go
https://dave.cheney.net/2018/05/29/how-the-go-runtime-implements-maps-efficiently-without-generics

Hope this helps,

--
Wojciech S. Czarnecki
<< ^oo^ >> OHIR-RIPE

Jan Mercl

unread,
Dec 24, 2020, 10:55:27 AM12/24/20
to Arnaud Delobelle, golang-nuts
On Thu, Dec 24, 2020 at 4:08 PM Arnaud Delobelle <arn...@gmail.com> wrote:

> The order is undefined but stable (provided you don't insert new values), and accessible via the `next` function. Here is an example:

Quoting from the provided link:

""""
next (table [, index])

Allows a program to traverse all fields of a table. Its first argument
is a table and its second argument is an index in this table. next
returns the next index of the table and its associated value. When
called with nil as its second argument, next returns an initial index
and its associated value. When called with the last index, or with nil
in an empty table, next returns nil. If the second argument is absent,
then it is interpreted as nil. In particular, you can use next(t) to
check whether a table is empty.

The order in which the indices are enumerated is not specified, even
for numeric indices. (To traverse a table in numerical order, use a
numerical for.)

The behavior of next is undefined if, during the traversal, you assign
any value to a non-existent field in the table. You may however modify
existing fields. In particular, you may clear existing fields.
""""

The word 'stable' does not seem to be present in this part of the
specs at all, so only "The order in which the indices are enumerated
is not specified, even for numeric indices." remains to be considered.
That said, I see no problem with implementing it by iterating a Go
map.

Or is the Lua specification perhaps incomplete? Maybe people
incorrectly rely on the behavior of a particular implementation.
That's why the Go map iteration is intentionally randomized, BTW.

Arnaud Delobelle

unread,
Dec 24, 2020, 11:09:43 AM12/24/20
to golang-nuts
Lua `next` is a function that takes two arguments: a table and a key and returns the another key (or nil).  Unlike an iterator, it holds no context.  It is guaranteed that you will iterate over all keys if you repeatedly call next as follows

k = next(t)
while k ~= nil do
    k = next(t, k)
end

If, however, you were to insert new items into t in the body of this loop, this guaranteed would no longer hold.  This is what is meant by "undefined behaviour", as far as I understand.

Arnaud
Reply all
Reply to author
Forward
0 new messages