Message limit

Delip Rao

Jan 12, 2010, 11:41:26 AM
to Protocol Buffers
Hi,

I'm trying to understand protobuf message size limits. Is the 64M
message limit fixed or can it be changed via some compile option? If I
have a message Foo defined as:

message Foo {
  repeated Bar bars = 1;
}

Will the limit apply to Foo or just the individual Bars?

Thanks,
Delip

Jason Hsueh

Jan 12, 2010, 12:40:33 PM
to Delip Rao, Protocol Buffers
The limit applies to the data source from which a message is parsed. So if you want to parse a serialization of Foo, it applies to Foo. But if you parse a bunch of Bar messages one by one, and add them individually to Foo, then the limit only applies to each individual Bar.

You can change the limit in your code if you create your own CodedInputStream and call its SetTotalBytesLimit method in C++, or its Java equivalent setSizeLimit.
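
To make the "limit applies to the data source" point concrete, here is a minimal stdlib-only sketch in Python. It is a toy model, not protobuf's CodedInputStream: the LimitedReader class, the fits_within_limit helper, and the byte sizes are all hypothetical and exist only for illustration. One reader that has to consume a whole Foo hits the limit, while a fresh reader per Bar does not.

```python
import io


class LimitedReader:
    """Toy model of a byte-limited input stream (NOT protobuf's CodedInputStream)."""

    def __init__(self, stream, total_bytes_limit):
        self._stream = stream
        self._limit = total_bytes_limit
        self._consumed = 0

    def read(self, n):
        # Count every byte handed out so far and refuse to pass the limit.
        if self._consumed + n > self._limit:
            raise ValueError(f"limit of {self._limit} bytes exceeded")
        data = self._stream.read(n)
        self._consumed += len(data)
        return data


def fits_within_limit(payload, limit):
    """Return True if payload can be read in full through one limited reader."""
    reader = LimitedReader(io.BytesIO(payload), limit)
    try:
        reader.read(len(payload))
    except ValueError:
        return False
    return True


# Hypothetical sizes: three 40-byte stand-ins for serialized Bar messages,
# and a 64-byte limit standing in for the real 64M default.
bars = [bytes(40) for _ in range(3)]
whole_foo = b"".join(bars)  # stand-in for one serialized Foo
LIMIT = 64

foo_ok = fits_within_limit(whole_foo, LIMIT)               # 120 bytes, one reader
bars_ok = all(fits_within_limit(b, LIMIT) for b in bars)   # 40 bytes per reader
```

With these assumed sizes, foo_ok is False and bars_ok is True: the same bytes pass or fail depending on whether they go through one data source or several.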

Kenton Varda

Jan 12, 2010, 2:28:54 PM
to Jason Hsueh, Delip Rao, Protocol Buffers
But you should consider a design that doesn't require you to send enormous messages.  Protocol buffers are not well-optimized for this sort of use.  For data stored on disk, consider storing multiple records in a RecordIO file.  For data passed over Stubby, consider streaming it in multiple pieces.

Kenton Varda

Jan 12, 2010, 2:29:41 PM
to Jason Hsueh, Delip Rao, Protocol Buffers
Dang it, I got my mailing lists mixed up and referred to some things we haven't released open source.  Sigh.

Kenton Varda

Jan 12, 2010, 2:35:24 PM
to Jason Hsueh, Delip Rao, Protocol Buffers
So to rephrase what I said:  You should break up your message into multiple pieces that you store / send one at a time.  Usually very large messages are actually lists of smaller messages, so instead of using one big repeated field, store each message separately.  When storing to a file, it's probably advantageous to use a "framing" format that lets you store multiple "records" such that you can seek to any particular record quickly -- using a large repeated field doesn't provide this anyway, so you need something else (we have some code internally that we call RecordIO).
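
One simple way to store each message separately is to prefix every record with its length. The sketch below is an assumption about what such framing could look like, not the RecordIO format (which the thread does not describe). It uses a base-128 varint for the length, the same style of integer encoding protobuf uses on the wire, and treats each payload as opaque bytes standing in for a serialized message. The function names are hypothetical.

```python
import io


def write_varint(out, value):
    """Write a non-negative int as a base-128 varint (7 bits per byte)."""
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.write(bytes([byte | 0x80]))  # more bytes follow
        else:
            out.write(bytes([byte]))
            return


def read_varint(stream):
    """Return the next varint, or None if the stream is at a clean EOF."""
    result = 0
    shift = 0
    while True:
        raw = stream.read(1)
        if not raw:
            if shift == 0:
                return None
            raise EOFError("truncated varint")
        byte = raw[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7


def write_record(out, payload):
    """Append one length-prefixed record."""
    write_varint(out, len(payload))
    out.write(payload)


def read_records(stream):
    """Yield records one at a time, so only one is in memory at once."""
    while True:
        size = read_varint(stream)
        if size is None:
            return
        payload = stream.read(size)
        if len(payload) != size:
            raise EOFError("truncated record")
        yield payload


# Payloads are stand-ins for individually serialized messages.
records = [b"first", b"", b"x" * 300]
buf = io.BytesIO()
for record in records:
    write_record(buf, record)

buf.seek(0)
decoded = list(read_records(buf))
```

Each record read this way would be parsed on its own, so any size limit applies per record. Note that plain length prefixes only allow sequential reads; seeking to a particular record needs extra structure such as an offset index.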

BTW, we would love to open source the libraries I mentioned, it's just a matter of finding the time to get it done.

Delip Rao

Jan 13, 2010, 8:19:57 AM
to Kenton Varda, Jason Hsueh, Protocol Buffers
Thanks folks, that was very useful. Right now I have a sequence of
messages since we're processing serially. RecordIO seems like a great
idea. Is the "framing format" just multiple messages in a file with an
inverted index in the beginning?

- Delip

Kenton Varda

Jan 13, 2010, 2:30:54 PM
to Delip Rao, Jason Hsueh, Protocol Buffers
I actually don't know how the format works, but that is certainly one possibility.
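
For illustration only, here is a sketch of the possibility raised in the question: records preceded by an index of byte offsets at the start of the file. The layout (a uint32 record count, one uint64 offset per record, then records with a uint32 length prefix) and the function names are made up for this example and say nothing about how RecordIO actually works.

```python
import io
import struct


def build_indexed_file(payloads):
    """Return bytes laid out as: count, offset table, length-prefixed records.

    Hypothetical layout: big-endian uint32 count, then one big-endian uint64
    absolute offset per record, then each record as uint32 length + payload.
    """
    header_size = 4 + 8 * len(payloads)
    offsets = []
    body = io.BytesIO()
    for payload in payloads:
        offsets.append(header_size + body.tell())
        body.write(struct.pack(">I", len(payload)))
        body.write(payload)
    header = struct.pack(">I", len(payloads))
    header += b"".join(struct.pack(">Q", offset) for offset in offsets)
    return header + body.getvalue()


def read_record_at(stream, index):
    """Seek straight to record number `index` using the offset table."""
    stream.seek(0)
    (count,) = struct.unpack(">I", stream.read(4))
    if not 0 <= index < count:
        raise IndexError(index)
    stream.seek(4 + 8 * index)
    (offset,) = struct.unpack(">Q", stream.read(8))
    stream.seek(offset)
    (size,) = struct.unpack(">I", stream.read(4))
    return stream.read(size)


# Payloads are stand-ins for individually serialized messages.
data = build_indexed_file([b"alpha", b"beta", b"gamma"])
third = read_record_at(io.BytesIO(data), 2)
```

Putting the index first means all record sizes must be known before writing, which is why this sketch builds the file in memory. A writer that streams records would more naturally place the index at the end.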