MarshalJSON vs UUIDs


Albert Strasheim

Aug 29, 2011, 11:54:20 AM
to golang-dev
Hello all

We have an internal uuid package in our code:

type Uuid []byte

type UuidKey string

func (uuid Uuid) Key() UuidKey {
	return UuidKey([]byte(uuid))
}

We added UuidKey so that we could have map[UuidKey]whatever, since a
slice type like Uuid cannot be used as a map key.

This has been working nicely until today, when we decided that we
wanted to marshal some internal state structs with maps with UuidKey
keys as JSON.

The documentation says:

Map values encode as JSON objects. The map's key type must be string;
the object keys are used directly as map keys.

In other words, even if we define a MarshalJSON function on UuidKey,
it isn't called.

This causes func (e *encodeState) string(s string) to return an
InvalidUTF8Error.

We could use the real string representation of UUIDs as keys, but we
call Key() frequently in performance-sensitive code (big maps with
lots of UUIDs).

Perhaps our String() function could be optimized, but this doesn't
seem ideal either...

func (uuid Uuid) String() string {
	if uuid == nil {
		return "<nil>"
	}
	if len(uuid) != 16 {
		panic("invalid uuid: not 16 bytes")
	}
	return fmt.Sprintf("%x-%x-%x-%x-%x", []byte(uuid[0:4]),
		[]byte(uuid[4:6]), []byte(uuid[6:8]), []byte(uuid[8:10]),
		[]byte(uuid[10:]))
}
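
One possible way to optimize String() is to skip fmt and hex-encode straight into a fixed-size buffer. The following is only a sketch under that assumption, using encoding/hex from the standard library; it has not been benchmarked against the version above, and the sample UUID in main is made up for illustration.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// Uuid mirrors the type from the post: 16 raw bytes.
type Uuid []byte

// String formats the UUID as 8-4-4-4-12 lowercase hex digits.
// It writes into a fixed 36-byte buffer instead of calling fmt.Sprintf.
func (uuid Uuid) String() string {
	if uuid == nil {
		return "<nil>"
	}
	if len(uuid) != 16 {
		panic("invalid uuid: not 16 bytes")
	}
	var buf [36]byte
	hex.Encode(buf[0:8], uuid[0:4])
	buf[8] = '-'
	hex.Encode(buf[9:13], uuid[4:6])
	buf[13] = '-'
	hex.Encode(buf[14:18], uuid[6:8])
	buf[18] = '-'
	hex.Encode(buf[19:23], uuid[8:10])
	buf[23] = '-'
	hex.Encode(buf[24:36], uuid[10:16])
	return string(buf[:])
}

func main() {
	// Sample value, made up for illustration.
	u := Uuid{0x12, 0x34, 0x56, 0x78, 0x9a, 0xbc, 0xde, 0xf0,
		0x01, 0x23, 0x45, 0x67, 0x89, 0xab, 0xcd, 0xef}
	fmt.Println(u) // 12345678-9abc-def0-0123-456789abcdef
}
```

Whether this is fast enough to use the string form as a map key would still need to be measured.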

Any thoughts on a way forward would be appreciated.

Regards

Albert

P.S. I've been thinking of trying

type Uuid string

instead, so that we can get rid of UuidKey, but this still doesn't
solve the problem with JSON.

Andrew Gerrand

Aug 29, 2011, 7:44:14 PM
to Albert Strasheim, golang-dev

1. I think "type Uuid string" is a better idea, because they're
(presumably) immutable, and the []byte to string conversion does an
allocation which you want to avoid in performance critical code.

2. As far as JSON goes, what's so bad about map[string]whatever
instead of map[Uuid]whatever? Yes, you lose the type annotation and it
requires a little more conversion in some places, but it seems like a
fine compromise for now.

3. JSON should probably handle maps with key types that have an
underlying string type.

Andrew

Russ Cox

Aug 29, 2011, 7:54:22 PM
to Andrew Gerrand, Albert Strasheim, golang-dev
> 3. JSON should probably handle maps with key types that have an
> underlying string type.

It does already. The problem is that JSON requires that
key values be Unicode strings, which in Go means that
they must be valid UTF-8 encoded strings. The UUIDs
are raw 8-bit data and almost never coincide with valid
UTF-8, so they cannot be encoded as JSON strings.

If the map keys were the hex format of the UUID then this
would be just fine. My suggestion would be to try this and
measure the performance hit before trying something more
complex.

Russ

Albert Strasheim

Aug 30, 2011, 1:50:38 AM
to r...@golang.org, Andrew Gerrand, golang-dev
Hello

On Tue, Aug 30, 2011 at 1:54 AM, Russ Cox <r...@golang.org> wrote:
>> 3. JSON should probably handle maps with key types that have an
>> underlying string type.
> It does already.  The problem is that JSON requires that
> key values be Unicode strings, which in Go means that
> they must be valid UTF-8 encoded strings.  The UUIDs
> are raw 8-bit data and almost never coincide with valid
> UTF-8, so they cannot be encoded as JSON strings.

How about checking if the key implements the Marshaler interface?

It might be a bit of an esoteric use case, but I can see that other
people might want to transform their strings when
marshalling/unmarshalling to be compatible with some other system.

> If the map keys were the hex format of the UUID then this
> would be just fine.  My suggestion would be to try this and
> measure the performance hit before trying something more
> complex.

I'll post the complete package with some benchmarks later today, but
the formatting overhead is quite significant.

Regards

Albert

Russ Cox

Aug 30, 2011, 7:41:28 AM
to Albert Strasheim, Andrew Gerrand, golang-dev
>> It does already.  The problem is that JSON requires that
>> key values be Unicode strings, which in Go means that
>> they must be valid UTF-8 encoded strings.  The UUIDs
>> are raw 8-bit data and almost never coincide with valid
>> UTF-8, so they cannot be encoded as JSON strings.
>
> How about checking if the key implements the Marshaler interface?

There's a slight mismatch here. The normal Marshaler interface
returns arbitrary JSON. Object keys are JSON strings. So we
could call Marshal and then error out if it returns something
other than a string. I guess that would be okay.
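
A rough sketch of that idea, written outside the json package: call the key's MarshalJSON and reject anything that is not a JSON string. The helper marshalKey, the badKey type, and the error value are all hypothetical names for illustration; this is not how the encoder is implemented.

```go
package main

import (
	"bytes"
	"encoding/hex"
	"encoding/json"
	"errors"
	"fmt"
)

// UuidKey holds raw UUID bytes in a string, as in the original post.
type UuidKey string

// MarshalJSON emits the hex form of the raw bytes as a JSON string.
func (k UuidKey) MarshalJSON() ([]byte, error) {
	return json.Marshal(hex.EncodeToString([]byte(k)))
}

// badKey is a hypothetical key type whose MarshalJSON returns a number.
type badKey int

func (badKey) MarshalJSON() ([]byte, error) { return []byte("42"), nil }

var errKeyNotString = errors.New("map key did not marshal to a JSON string")

// marshalKey (hypothetical helper) calls MarshalJSON on the key and
// returns an error unless the result is a JSON string.
func marshalKey(key json.Marshaler) (string, error) {
	b, err := key.MarshalJSON()
	if err != nil {
		return "", err
	}
	b = bytes.TrimSpace(b)
	if len(b) == 0 || b[0] != '"' {
		return "", errKeyNotString
	}
	var s string
	if err := json.Unmarshal(b, &s); err != nil {
		return "", err
	}
	return s, nil
}

func main() {
	s, err := marshalKey(UuidKey("\xde\xad\xbe\xef"))
	fmt.Println(s, err)

	_, err = marshalKey(badKey(1))
	fmt.Println(err)
}
```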

Russ

Albert Strasheim

Aug 30, 2011, 7:53:54 AM
to r...@golang.org, Andrew Gerrand, golang-dev
Hello

Thanks, I'll try to prepare a patch soon.

In the meantime, we've worked around it by implementing the Marshaler
interface on our map type.

At that point we go through the effort of creating a map with proper
string keys and we then return the result of Marshal on that
"well-formed" map.

This is a lot of work, but it doesn't happen in a performance-critical
part of the code.

At the same time, it doesn't slow down the normal use of UUIDs in
maps. I'm going to try UUIDs as strings anyway to get rid of the
UuidKey wart though.

Regards

Albert
