Serialization time in C++ using protobuf


ShirishKul

Mar 19, 2009, 1:24:14 AM
to Protocol Buffers
I used protobuf to serialize an object in C++. The serialized binary was
around 300 KB and serialization took 1359.4098 milliseconds. I wonder why
it took so much time in C++, whereas on the Java side serializing a
similar object took 39.62626263 milliseconds.

I've seen that SerializeToOstream took around 1.28 seconds of that time
in the C++ case.

Am I missing something? Any pointers would be highly
appreciated.

Thanks,
Shirish

Kenton Varda

Mar 19, 2009, 1:41:57 AM
to ShirishKul, Protocol Buffers
First, are you using:

  option optimize_for = SPEED;

?  If not, add that line to your .proto file.  But even without that option, serialization shouldn't be that slow.  Maybe you can run it under a profiler to see what's taking so long?
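For anyone following along, here's a minimal sketch of where that line goes in a .proto file. The message and fields below are made up for illustration; only the optimize_for line itself is from this thread:

```proto
// Hypothetical example; only the optimize_for line is the point here.
option optimize_for = SPEED;

message SearchRequest {
  optional string query = 1;
  optional int32 page_number = 2;
}
```

The option is file-level, so it applies to every message generated from that .proto.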

Are you writing to an in-memory buffer or some sort of output stream?  Is it possible that the stream is blocking?

For reference, most of my C++ serialization benchmarks get around 250-500 MB/s with optimize_for = SPEED and 15-25 MB/s without it.

ShirishKul

Mar 19, 2009, 2:23:51 AM
to Protocol Buffers
Hi Kenton,

Thanks for pointing this out...

I had not used option optimize_for = SPEED;

After adding this option I got 102.469168 milliseconds. Using
optimize_for = SPEED makes quite a drastic difference.

The output in my case is a file I'm writing to.

But again, this doesn't seem to match your throughput numbers. Do I
still need to make changes to get better results?

Why is the SPEED optimization an option rather than the default? What
is the benefit of not using it? Any ideas?

Regards,
Shirish

ShirishKul

Mar 19, 2009, 2:37:56 AM
to Protocol Buffers
OK, finally I was able to get 46.8804 milliseconds. That should be
plenty fast for the 300 KB file generated by serialization. Thanks
again.

Regards,
Shirish

Kenton Varda

Mar 19, 2009, 12:51:26 PM
to ShirishKul, Protocol Buffers
On Wed, Mar 18, 2009 at 11:23 PM, ShirishKul <shiri...@gmail.com> wrote:
> The output in my case is a file I'm writing to.
>
> But again, this doesn't seem to match your throughput numbers. Do I
> still need to make changes to get better results?

What happens if you serialize to an in-memory buffer?  For example, try:

  string data;
  my_message.SerializeToString(&data);

Is this faster?  If so, then the time is being spent writing to disk, not serializing the message, and protocol buffers can't help you with that.
 
> Why is the SPEED optimization an option rather than the default? What
> is the benefit of not using it? Any ideas?

The other option (and the default) is CODE_SIZE.  In SPEED mode, a *lot* of code is generated, which can make your binaries very big.  We have had problems at Google with binaries getting way too big.  That said, I'm planning to make SPEED the default in the next release.