gob decode: limit string length


Roberto Zanotto

Sep 30, 2015, 1:29:42 PM
to golang-nuts
Hi.
I would like to set a limit to the length of the string when decoding one. Should I decode to an array of bytes? Are there other options?
Cheers.

Giulio Iotti

Sep 30, 2015, 3:03:55 PM
to golang-nuts
On Wednesday, September 30, 2015 at 8:29:42 PM UTC+3, Roberto Zanotto wrote:
Hi.
I would like to set a limit to the length of the string when decoding one. Should I decode to an array of bytes? Are there other options?

How can you decode to an array of bytes if it was encoded as a string?

It depends on how you want to handle strings over the limit: either use a LimitedReader, so that the decoder fails with an unexpected EOF, or truncate the string after decoding. Gob has a limit at 1GB but it's not configurable (yet).

-- 
Giulio Iotti


Roberto Zanotto

Sep 30, 2015, 3:40:33 PM
to golang-nuts
I'll provide a little more context. I have a server and I want to expose a public API based on gob over TLS. I want to avoid the situation where malicious clients can force the server to decode giant strings and possibly hold them in memory. I can define the protocol, so I could decide that strings have to be encoded/decoded as UTF-8 byte slices, but that doesn't seem very nice. I guess I can't wrap the connection with a LimitedReader.

James Aguilar

Sep 30, 2015, 10:05:08 PM
to golang-nuts
Limit reader + http://www.checkupdown.com/status/E413.html or its moral equivalent seems like the right thing.

Roberto Zanotto

Sep 30, 2015, 11:28:10 PM
to golang-nuts
The thing is that I would like to limit the size of a single request, not the size of the whole stream/connection. But thanks to both of you for the answers; I understand that the problem cannot be addressed by gob as it is now. Maybe I can accomplish what I want with a customized LimitedReader.

Do you think it might be worth adding to gob the ability to recognize some struct field tags to limit the size of the data to be decoded/encoded? It seems like it could be useful for various data types (slices, bigints, strings...).

Rob Pike

Oct 1, 2015, 12:20:46 AM
to Roberto Zanotto, golang-nuts
It's trivial to break the stream into messages without parsing their contents, since each message starts with a byte count. It would be very easy to wrap the I/O going to gob and shut down any connection that sends a message with a count > N for some N. That is, it would be easy to do this outside of the gob package. Yay interfaces.

You don't want to limit just strings, so this is a better answer anyway. The package already does this internally, but the size it accepts is quite large.

-rob



Roberto Zanotto

Oct 1, 2015, 11:04:47 AM
to golang-nuts, roby...@gmail.com
Thanks for the suggestion.
I'll leave some detail here for posterity.

If a single goroutine does all the decoding, the problem can be solved with an io.LimitedReader: just (re)set the maximum number of bytes before each call to Decode. This approach is nice because it is trivial; it is less nice because it breaks with concurrent goroutines and because resetting the byte count is not encapsulated inside the ReadWriter interface.

The approach Rob suggested solves the concurrency and encapsulation problems nicely, but requires a (very limited) understanding of gob's format.
From the gob documentation:
Finally, each message created by a call to Encode is preceded by an encoded unsigned integer count of the number of bytes remaining in the message. After the initial type name, interface values are wrapped the same way; in effect, the interface value acts like a recursive invocation of Encode.

How an unsigned integer is encoded:
An unsigned integer is sent one of two ways. If it is less than 128, it is sent as a byte with that value. Otherwise it is sent as a minimal-length big-endian (high byte first) byte stream holding the value, preceded by one byte holding the byte count, negated. Thus 0 is transmitted as (00), 7 is transmitted as (07) and 256 is transmitted as (FE 01 00).

Function to decode a uint:
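
A minimal sketch of such a function, written from the encoding rules quoted above rather than taken from the original post. The name decodeUint and its error messages are assumptions. It returns the value and the number of bytes the encoding occupied, which is what a wrapper needs in order to find the end of the length prefix.

```go
package main

import (
	"errors"
	"fmt"
)

// decodeUint parses one gob-encoded unsigned integer from the front of b and
// returns the value together with the number of bytes it occupied.
func decodeUint(b []byte) (value uint64, width int, err error) {
	if len(b) == 0 {
		return 0, 0, errors.New("empty input")
	}
	if b[0] < 0x80 {
		return uint64(b[0]), 1, nil // values below 128 are a single byte
	}
	n := -int(int8(b[0])) // otherwise the first byte is the negated byte count
	if n > 8 {
		return 0, 0, errors.New("byte count too large for uint64")
	}
	if len(b) < 1+n {
		return 0, 0, errors.New("truncated input")
	}
	for _, c := range b[1 : 1+n] {
		value = value<<8 | uint64(c) // big-endian, high byte first
	}
	return value, 1 + n, nil
}

func main() {
	// The three examples from the gob documentation: 0, 7 and 256.
	for _, in := range [][]byte{{0x00}, {0x07}, {0xFE, 0x01, 0x00}} {
		v, w, err := decodeUint(in)
		fmt.Println(v, w, err)
	}
}
```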

Roberto Zanotto

Oct 8, 2015, 10:45:48 AM
to golang-nuts
I implemented the LimitedGobReader, and it seems to be working fine. Just a small question: when should my Read return?
gob calls Read with a buffer size of 4096. Should I try to fill it as much as I can, or should I return each gob message on the stream separately? The latter seems better to me, since I can't tell when my reads from the underlying Reader will block, but I'd appreciate the opinion of someone who has experience with Readers.

Roberto Zanotto

Oct 8, 2015, 11:23:49 AM
to golang-nuts
Found:
If some data is available but not len(p) bytes, Read conventionally returns what is available instead of waiting for more.

Giulio Iotti

Oct 8, 2015, 11:47:46 AM
to golang-nuts
On Thursday, October 8, 2015 at 6:23:49 PM UTC+3, Roberto Zanotto wrote:
Found:
If some data is available but not len(p) bytes, Read conventionally returns what is available instead of waiting for more.

On the other hand, if you need less data than might be available, use a buffered reader to minimize the context switches with the kernel (the actual reads).

-- 
Giulio Iotti 

Roberto Zanotto

Oct 8, 2015, 12:26:20 PM
to golang-nuts
I already use a buffered reader, because I need to Peek, and the gob Decoder wants a buffered reader anyway (it wraps the reader in a bufio.Reader if it does not implement io.ByteReader) ;)