pkg/encoding - Managing cycles in encoded data

138 views
Skip to first unread message

Brigham Toskin

unread,
Aug 25, 2014, 8:03:33 PM8/25/14
to golan...@googlegroups.com
From the godoc for encoding/gob as of 1.3.1: "Recursive types work fine, but recursive values (data with cycles) are problematic. This may change."

I didn't see any threads in golang-dev, roadmap items for 1.4, or issues in the google code tracker for this, but it seems to be an issue that the library writers knew about. I ran across this myself while marshaling to json, though only the docs for encoding/gob seem to make a passing reference to the limitation. In the following trivial example, it builds with no errors, but comes back with [process took too long] in the playground and fatal error: stack overflow if run locally, because it recurses infinitely when it follows the Next pointerhttp://play.golang.org/p/4mAZ8Ao-tm. (Note: gob.Encoder.Encode does seem to leverage some cycle detection code in type.go for recursive pointer types, but not for recursive data.)

Is there a concrete plan to deal with this? I'd be willing to submit a patch for code review; I'll even try to crank it out and test it before the 1.4 lockdown, if I can find the time.

Without having gone into the code in any depth, it seems like the simplest solution would be to store pointers in a map set (a linear slice search could suck if we are encoding e.g. a very, very large linked list), and do something more reasonable than naïvely blowing up the stack when we go to process previously-visited nodes. I'm not sure if there's a general solution to serializing arbitrary data that will work in all cases without doing weird things with the output format (see e.g. Lua serialization patterns), and this may depend on what format we're encoding to, but we could at least return nil, err if there's nothing sane to do.

This wouldn't technically break compatibility with previous 1.x builds, since it addresses a bug that would have kept go programs from ever working in the first place. But, some thought should go into it, because it will set expectations of future programs and what we can change later.

Thoughts? Feedback? Ideas for how to handle this without returning an error? Or is this already being handled?


Brigham Toskin

Rob Pike

unread,
Aug 25, 2014, 8:18:50 PM8/25/14
to Brigham Toskin, golan...@googlegroups.com
It needs to be done without allocation and hence garbage, which is the
main reason I've resisted it. Printf has the same problem. It also
needs to work, which in gob is really not easy.

Even an error check is tricky, but I'd be willing to look at code,
speculatively. Tell me your plan before sending a CL though.

-rob

Brigham Toskin

unread,
Aug 25, 2014, 8:27:55 PM8/25/14
to Rob Pike, golan...@googlegroups.com
Okay, so it looks like what I should be looking for is a more general solution for cycle detection, rather than a quick patch for the encoders. Maybe this is something that ought to be in the runtime, then?
--
Brigham Toskin
Reply all
Reply to author
Forward
0 new messages